ABSTRACT

The thesis of this dissertation is that formal definitions 
of the syntax and semantics of computer languages are needed. 
This dissertation investigates two candidates for formally 
defining computer languages: 

(1) the formalism of canonical systems for defining 
the syntax of a computer language and its translation into 
a target language, and 

(2) the formalisms of the λ-calculus and extended
Markov algorithms as a combined formalism used as the basis
of a target language for defining the semantics of a computer
language.

Formal definitions of the syntax and semantics of SNOBOL/1
and ALGOL/60 are included as examples of the approach.



Thesis Supervisor: Edward L. Glaser

Title: Associate Professor of Electrical Engineering, M.I.T.
(currently Chairman, Department of Information and
Computer Sciences, Case Western Reserve University)






ACKNOWLEDGEMENT 

To Professor Edward Glaser, whose insight and imagination 
have sparked my enthusiasm and prompted many major develop- 
ments throughout this dissertation; 

To Professor John Wozencraft, whose warm guidance and 
penetrating criticisms have motivated a standard that this 
dissertation can only approximate; 

To Professor Robert Graham, whose practical understand- 
ing of computer languages has helped initiate and direct 
this dissertation; 

To Peter Landin, who patiently devoted hours teaching me 
his ideas on computer languages; 

To Professor John Donovan, for his collaboration on 
canonic systems; 

To Calvin Mooers, for many lively discussions on key 
issues; 

To Leon Groisser, for his wise and thoughtful comments on 
my life as a student; 

And to my parents, whose lifelong support has been in- 
valuable. 



"Work reported herein was supported (in part) by 
Project MAC, an M.I.T. research program sponsored
by the Advanced Research Projects Agency, Department
of Defense, under Office of Naval Research
Contract Number Nonr-4102(01). Reproduction in
whole or in part is permitted for any purpose of
the United States Government."



A Virtuoso Typist: Mrs. Lila S. Hartmann 



STATEMENT OF ORIGIN

I gratefully acknowledge the following men, upon whose 
work this dissertation is heavily based. In particular: 

a. The formalism of canonical systems is due to Emil 
Post and Raymond Smullyan. 

b. The application of "canonic" systems to specify the 
syntax of a computer language was first made by 
John Donovan. 

c. The notion of a defining canonical system and its 
use in formalizing derivations appeared earlier in 
works by Smullyan and Donovan. 

d. The formalism of the λ-calculus is due to Alonzo
Church.

e. The application of the λ-calculus to define a portion
of the semantics of a computer language was
first made by Peter Landin.

f. The characterizations of the semantics of ALGOL/60 
and of the evaluator for the target language are 
based in part on similar characterizations by Landin.

g. The formalism of Markov algorithms is due to A. A. 
Markov. 

h. The notion of adding string variables to Markov 
algorithms is due to A. Caracciolo. 

The application and integration of the above work to 
define the syntax and semantics of computer languages is the 
principal contribution of this dissertation. In particular: 

a. The application of canonical systems to define the 
translation of computer languages is due to the 
author. 

b. The application of defining canonical systems to de- 
fine notational abbreviations is new. 

c. The notation for canonical systems and the uniform
notation for defining canonical systems are for the 
most part new. 

d. The application of the λ-calculus and (extended)
Markov algorithms to define the primitive functions
in a computer language is new.

e. The application of (extended) Markov algorithms to 
define the operation of an evaluator for the target 
language for characterizing semantics is new. 

f. The definitions of the syntax and semantics of 
SNOBOL/1 and ALGOL/60 are new.



DEFINITIONS



The following words are used like household words in 
this dissertation: 

Symbol: A character or any indivisible sequence of 
characters. 

Alphabet: A set of symbols. 

String: A sequence of symbols on an alphabet. 

Language: A set of strings. 

Syntax: The set of rules specifying the strings in a 
language. 

Semantics: The set of rules relating the strings in a 

language to the "behavior" or "objects" that 
the strings denote. For a computer language 
implemented by translating the strings in the 
language into strings in a target language, 
the behavior or objects that a string denotes 
is defined by the corresponding target lan- 
guage string, whose meaning is presumably 
understood. 

Translation: A function mapping one set of strings into 
another set of strings. 

Abbreviation: A bijective function mapping one set of
strings (the unabbreviated strings) into
another set of strings (the abbreviated
strings). The bijectiveness of the function
ensures the unique reversibility of the mapping.




Machines should work, people should think.



slogan from IBM television and magazine advertisements



CHAPTER I 
INTRODUCTION 

This dissertation has a thesis: that formal defini- 
tions of the syntax and semantics of computer languages are 
needed. The formal system presented here was developed as 
a step towards meeting this objective. 

There already exist formalisms, languages, and techniques 
for defining syntax and semantics. To be successful, a de- 
fining mechanism (or for that matter a computer language) 
should be simple, do clever things, and at the same time dis- 
play fundamental principles about the objects being defined. 
Most methods for defining computer languages do not satisfy 
these criteria. The objective of this dissertation was to 
attempt to meet these criteria, to develop a lucid and uniform 
method for defining computer languages. A formal approach to 
language definition was taken in the hope that this approach 
would gain a degree of precision, simplicity and theoretical 
power. Although these virtues are not completely satisfied 
in this dissertation, I believe the formal system presented 
here excels existing methods for defining the syntax and 
semantics of a computer language. The shortcomings of this 
approach to language definition and recommendations for 
future research in removing these shortcomings are discussed 
in the conclusions of Chapters II and III and in Chapter VI. 



Research generally progresses in two directions: in 
the development of new theories, and in the application and 
simplification of existing theories. This research is a 
study in the second direction. In particular, an attempt 
has been made to keep the notation and terminology of the 
formal system as simple as possible. It is natural for the 
author of a work to introduce notation, terminology, and 
conventions that became convenient for him to use, but which 
often obscure the work and its contributions to others. This 
author has tried to avoid this temptation. 

The formal system for defining syntax and semantics will 
be given in two parts. First, Chapter II presents the for- 
malism of canonical systems, which will be used to define the 
syntax of a computer language and its translation into an 
arbitrary target language. Second, Chapter III presents the 
formalisms of extended Markov algorithms and the λ-calculus,
which will be used as the basis for a particular target 
language for defining the semantics of a computer language. 
The semantics of the target language are specified, in turn, 
by giving an extended Markov algorithm definition of a func- 
tion for mapping a string in the target language into a 
string denoting its value. 

Chapters IV and V illustrate the formal system by de- 
fining the syntax and semantics of the computer languages 
SNOBOL/1 and ALGOL/60. In particular, Chapter IV describes
SNOBOL/1 in the spirit of providing a reference manual for
SNOBOL/1, and is directed to the reader who wishes a detailed
knowledge of the language. Chapter V not only explicates
the formal definition of ALGOL/60 but also relates the formal
definition to other languages and other methods of language 
definition. Finally, Chapter VI contains a discussion of the 
utility of the formal system in defining computer languages. 


CHAPTER II

CANONICAL SYSTEMS: A SELF-EXTENDING FORMALISM

FOR SPECIFYING THE SYNTAX OF A COMPUTER LANGUAGE 

AND ITS TRANSLATION INTO A TARGET LANGUAGE 

This chapter presents the formalism of canonical systems
and its application to define the syntax of a computer language 
and its translation into a target language. 

The mathematical underpinnings of canonical systems are due
to Emil Post and Raymond Smullyan [2]. Canonical systems can be
used to specify any "recursively enumerable" set [2]. The set
of strings comprising all syntactically legal programs in a
computer language and the set of pairs of strings comprising
all syntactically legal programs in a computer language and
their translations into a target language are just two examples
of recursively enumerable sets. Presumably, canonical systems
can specify any translation or algorithm that a machine can
perform. Heuristic evidence that this statement is true is
due to the works of Turing [30, 31] and Kleene [32]. In these works
the functions computable by a Turing machine were
asserted to comprise every function or algorithm that is
intuitively computable by machine, and the functions computable
by a Turing machine were shown equivalent [31, 32] to the
set of all "general recursive" sets, which are encompassed by
canonical systems.

The application of a logically modified variant of the
formal systems of Post [1], Smullyan [2], and Trenchard More [38] to
specify completely the syntax of a computer language was first
made by John Donovan [3, 5]. Donovan applied his formal system
to specify the set of legal programs in a computer language, 
including the specification of allowable character spacing, 
and more importantly, the specification of context-sensitive 
requirements on the set of legal programs, like the require- 
ment that all statement labels in a program be different. 

Donovan introduced the term "canonic systems" (in recognition
of Post's work [1]) to describe his formal system. Although
Donovan's formal system is not used here, many ideas
and techniques presented here have stemmed from Donovan's 
work. The name "canonical systems" is used to distinguish 
the formal system presented in this dissertation from the 
formal systems of Post, Smullyan and Donovan. A discussion 
of the theoretical background for canonical systems (as presented
here) is given in Appendix 5. The terminology for
canonical systems presented here is due to both Post and
Smullyan [2]. The notation for canonical systems presented here
is due in part to Post [1], Smullyan [2] and Donovan [3], and is in
large part new. Many hours were spent in developing the notation
presented here in the hope that the notation would be
well-suited to computer languages. Discussions with Calvin
Mooers have had a major effect on the notation.

To illustrate by example the techniques used in specifying
the syntax and translation of a computer language with
canonical systems, a small and rather useless subset
of ALGOL/60 will be taken as a source language, while IBM
System/360 assembler language will be taken as a target
language. The Backus-Naur form specification of the ALGOL/60
subset is given below:



<DIGIT>     ::= 1 | 2 | 3
<VAR>       ::= A | B
<PRIMARY>   ::= <DIGIT> | <VAR>
<ARITH EXP> ::= <PRIMARY> | <ARITH EXP> + <PRIMARY>
<STM>       ::= <VAR> := <ARITH EXP>
<TYPE LIST> ::= A | B | A,B
<DEC>       ::= INTEGER <TYPE LIST>
<PROGRAM>   ::= BEGIN <DEC> ; <STM> END



This subset allows programs containing only one declaration 
and one limited type of arithmetic assignment statement. 
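
As an informal aside, the subset is small enough that its Backus-Naur form can be checked mechanically. The following Python sketch is not part of the formal system of this dissertation. It assumes that blanks may separate symbols (the Backus-Naur form above says nothing about spacing), and the names PROGRAM and is_bnf_program are illustrative only. The left-recursive rule for <ARITH EXP> is rewritten as a primary followed by zero or more "+ primary" groups, which describes the same strings.

```python
import re

# One regular-expression fragment per Backus-Naur rule of the subset.
DIGIT = r"[123]"
VAR = r"[AB]"
PRIMARY = rf"(?:{DIGIT}|{VAR})"
# <ARITH EXP> ::= <PRIMARY> | <ARITH EXP> + <PRIMARY>, written iteratively.
ARITH_EXP = rf"{PRIMARY}(?:\s*\+\s*{PRIMARY})*"
STM = rf"{VAR}\s*:=\s*{ARITH_EXP}"
TYPE_LIST = r"(?:A,B|A|B)"
DEC = rf"INTEGER\s+{TYPE_LIST}"
# Assumption: at least one blank after BEGIN and before END.
PROGRAM = re.compile(rf"BEGIN\s+{DEC}\s*;\s*{STM}\s+END")


def is_bnf_program(text):
    """Return True when text is a <PROGRAM> according to the rules above."""
    return PROGRAM.fullmatch(text) is not None
```

Under these assumptions, is_bnf_program accepts "BEGIN INTEGER A; A:=1 END" and also "BEGIN INTEGER B; A:=1 END", since the Backus-Naur form alone does not relate the declared variables to the variables used in the statement.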

The rules for constructing a canonical system definition 
of a computer language, the rules for abbreviating a canonical 
system, and the rules for deriving strings defined by a 
canonical system will be presented informally in Section 2.1 
of this chapter using the English language. In Section 2.2 
these rules will be formally stated using the notion of a 
defining canonical system. In particular, each underlined 
expression in the next section will be defined formally in 
Section 2.2 with a defining canonical system. I now proceed 
to the informal definition of canonical systems and the appli- 
cation of this formalism to specify the syntax and translation 
of a computer language. 

2.1 Canonical Systems

2.1a The Basic Formalism 

A canonical system consists of a collection of the follow- 
ing items: 

(1) An alphabet A, called the object alphabet.

(2) An alphabet P, called the predicate alphabet. Each
predicate in the predicate alphabet is assigned a
unique positive integer called its degree.

(3) An alphabet V, called the variable alphabet.

(4) Another alphabet, which consists of six punctuation
symbols, the implication sign, conjunction sign,
tuple sign, delimiter sign, left bracket sign, and
right bracket sign.

(5) A finite sequence of strings that are well-formed 
productions, according to the definition given 
below. 

In a well-formed production, it is necessary to be able 
to determine the alphabet from which each symbol is drawn. 
Accordingly, I will use (a) lower case English letters (pos- 
sibly subscripted or superscripted) for variable alphabet 
symbols (b) strings of capital English letters, digits, and 
spaces, each separated by a tuple sign, for predicate alpha- 
bet symbols (c) the symbols 

->  implication sign

,   conjunction sign

:   tuple sign

;   delimiter sign

<   left bracket sign

>   right bracket sign

for punctuation symbols, and (d) symbols not in alphabets (2),
(3) and (4) for object alphabet symbols.

A well-formed term consists of a sequence of variable
and object alphabet symbols (e.g., "a+p" and "uv"). A
well-formed term tuple consists of a sequence of terms each
separated by a tuple sign and enclosed by a left and right
bracket sign (e.g., "<a+p:uv>"). A well-formed atomic formula
consists of a predicate alphabet symbol followed by a term
tuple (e.g., "ARITH EXP:VARS<a+p:uv>"). A well-formed production
consists of (a) an atomic formula followed by the
delimiter sign (e.g., "ARITH OP<+>;") or (b) a sequence of
atomic formulas each separated by the conjunction sign and
followed by the implication sign, another atomic formula, and
the delimiter sign (e.g., "PRIMARY:VARS<p:v>,
ARITH EXP:VARS<a:u> -> ARITH EXP:VARS<a+p:uv>;"). An atomic
formula occurring before the implication sign is called a
premise. An atomic formula following the implication sign
or occurring alone is called a conclusion. A production
containing no premises is called an atomic production.

In the specification of written expressions in computer
languages, it will often be necessary to include English
letters, digits, spaces, and the punctuation symbols as members
of the object alphabet. Since predicate alphabet characters,
the implication sign, conjunction sign, and delimiter
sign cannot occur within the brackets of a term tuple, I
adopt the convention that these symbols can be used in a term
tuple as object alphabet symbols. Furthermore, let the quotation
marks "‘" and "’" be symbols not contained in the object
alphabet. Strings containing variable alphabet symbols, the
tuple sign, left bracket sign and right bracket sign can
also be used as members of the object alphabet provided that
the strings are enclosed by the quotation marks when used
within a production. For example, consider the following
productions:

VAR<A>;

VAR<‘x’>;

VAR<v> -> ARITH EXP:VARS<v:v,>;

VAR<v>, ARITH EXP:VARS<a:u> -> ARITH EXP:VARS<a+v:uv,>;

Here, the symbols {A x + ,} enclosed in angle brackets are 
object alphabet symbols. The symbols {a v u} are variable 
alphabet symbols. 

A derivation is a string that can be obtained from a
canonical system using the following two rules:

(1) If c; is a production containing no premises, then
the string c can be derived from the canonical system.

(2) If p -> c; is a production with premises p, and q -> d;
is an instance of this production with each variable
in the production replaced by some object string,
and each premise in q has been previously derived,
then the string d can be derived from the canonical
system.

These rules can be applied to the previously given productions
to derive the strings

VAR<A>    VAR<x>

ARITH EXP:VARS<A:A,>    ARITH EXP:VARS<A+x+A:A,x,A,>

The strings derivable from a canonical system will be interpreted
in the following way. A predicate will be interpreted
as the name of a set; the term tuple following a predicate
will be interpreted as a string that is a member of the named
set. In the above case, the set "VAR" contains two members,
the strings "A" and "x". The set "ARITH EXP:VARS" contains
an infinite number of members, some of which are "A:A," and
"A+x+A:A,x,A,". Furthermore, I will follow the convention
that each string of predicate characters separated by a tuple 
sign will be called a predicate part, and that predicates 
of degree k will consist of either one or k predicate parts. 
In the case where a predicate of degree k consists of k predi- 
cate parts ( e.g., "ARITH EXP:VARS"), each predicate part of the 
predicate will be some mnemonic describing the intended in- 
terpretation of the corresponding term in the associated term 
tuple (e.g., in the atomic production "ARITH EXP:VARS 
<a+p:uv>" the string "a+p" is interpreted as an arithmetic 
expression and the string "uv" is interpreted as the list of 
variables used in the arithmetic expression). The predicate 
parts and terms occurring after the tuple sign in an atomic 
production will be called "auxiliary" predicate parts and 
"auxiliary" terms (in the above case the term "uv" is the 
auxiliary term for the auxiliary predicate part "VARS" ) . 

For example, next consider the following canonical system
specifying a set named "ARITH EXP:VARS", consisting of all
pairs of strings such that the first element of each pair
is an arithmetic expression in the subset of ALGOL/60, and
the second element of each pair is a list of the variables
occurring in the arithmetic expression:*



1.1 DIGIT<1>;
1.2 DIGIT<2>;
1.3 DIGIT<3>;
2.1 VAR<A>;
2.2 VAR<B>;
3.1 DIGIT<d> -> PRIMARY:VARS<d:Λ>;
3.2 VAR<v> -> PRIMARY:VARS<v:v,>;
3.3 PRIMARY:VARS<p:v> -> ARITH EXP:VARS<p:v>;
3.4 PRIMARY:VARS<p:v>, ARITH EXP:VARS<a:u> -> ARITH EXP:VARS<a+p:uv>;

*The symbol "Λ" denotes the null string, i.e., if P is a
string then

PΛ = P = ΛP

These productions can be interpreted:

1.1 The symbol "1" is a member of the set named "DIGIT".

1.2 The symbol "2" is a member of the set named "DIGIT".

1.3 The symbol "3" is a member of the set named "DIGIT".

2.1 The symbol "A" is a member of the set named "VAR".

2.2 The symbol "B" is a member of the set named "VAR".

3.1 If "d" represents a member of the set named "DIGIT",
then the pair of strings denoted by "d:Λ" is a member of the
set named "PRIMARY:VARS".

3.2 If "v" represents a member of the set named "VAR",
then the pair of strings denoted by "v:v," is a member of the
set named "PRIMARY:VARS".

3.3 If the pair "p:v" represents a member of the
set named "PRIMARY:VARS",
then the pair of strings denoted by "p:v" is a member of the
set named "ARITH EXP:VARS".

3.4 If the pair "p:v" represents a member of the set named
"PRIMARY:VARS",
and the pair "a:u" represents a member of the set named
"ARITH EXP:VARS",
then the pair of strings denoted by "a+p:uv"
is a member of the set named "ARITH EXP:VARS".

or more informally:

1. The symbols "1", "2" and "3" are digits.

2. The symbols "A" and "B" are variables.

3.1 If "d" is a digit,
then "d" is a primary with a null list of variables.

3.2 If "v" is a variable,
then "v" is a primary with a list "v," of variables.

3.3 If "p" is a primary with a list of variables "v",
then "p" is an arithmetic expression with the same list of
variables "v".

3.4 If "p" is a primary with a list of variables "v",
and "a" is an arithmetic expression with a list of
variables "u",
then "a+p" is an arithmetic expression with a list of
variables "uv".

The rules for deriving strings specified by a canonical
system can be applied to these productions to conclude that
(a) the set named "DIGIT" consists of three members, the
symbols "1", "2" and "3", (b) the set named "PRIMARY:VARS"
consists of five members, the pairs of strings "1:Λ",
"2:Λ", "3:Λ", "A:A,", and "B:B,", and (c) the set named
"ARITH EXP:VARS" contains an infinite number of members,
some of which are "A:A,", "1+2:Λ", "A+B:A,B,", and
"A+1+2+A+B:A,A,B,".

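The two derivation rules can be tried out on productions 1.1 to 3.4 with a short Python sketch. It is an illustration under simplifying assumptions, not the formal definition given in Section 2.2: every premise term is assumed to be a single variable (true of these productions), lower case letters are variables, the null string is written as the empty Python string, and the derivation is cut off after a fixed number of rounds because the set "ARITH EXP:VARS" is infinite. The names PRODUCTIONS, substitute and derive are hypothetical.

```python
from itertools import product

# (premises, conclusion); an atomic formula is (predicate, tuple of terms).
PRODUCTIONS = [
    ([], ("DIGIT", ("1",))),
    ([], ("DIGIT", ("2",))),
    ([], ("DIGIT", ("3",))),
    ([], ("VAR", ("A",))),
    ([], ("VAR", ("B",))),
    ([("DIGIT", ("d",))], ("PRIMARY:VARS", ("d", ""))),
    ([("VAR", ("v",))], ("PRIMARY:VARS", ("v", "v,"))),
    ([("PRIMARY:VARS", ("p", "v"))], ("ARITH EXP:VARS", ("p", "v"))),
    ([("PRIMARY:VARS", ("p", "v")), ("ARITH EXP:VARS", ("a", "u"))],
     ("ARITH EXP:VARS", ("a+p", "uv"))),
]


def substitute(term, binding):
    """Replace each variable symbol in a term by the object string bound to it."""
    return "".join(binding.get(symbol, symbol) for symbol in term)


def derive(productions, rounds):
    """Apply derivation rules (1) and (2) for a fixed number of rounds."""
    facts = set()
    for _ in range(rounds):
        new_facts = set()
        for premises, (predicate, terms) in productions:
            # Previously derived strings that could satisfy each premise.
            pools = [[f for f in facts if f[0] == name] for name, _ in premises]
            for combo in product(*pools):
                binding, consistent = {}, True
                for (_, variables), (_, values) in zip(premises, combo):
                    for var, val in zip(variables, values):
                        if binding.setdefault(var, val) != val:
                            consistent = False
                if consistent:
                    new_facts.add(
                        (predicate, tuple(substitute(t, binding) for t in terms)))
        facts |= new_facts
    return facts
```

Calling derive(PRODUCTIONS, 4) yields the three digits, the two variables, the five members of "PRIMARY:VARS", and the members of "ARITH EXP:VARS" containing at most two primaries; more rounds yield longer expressions.
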
Abbreviations to the Basic Notation: 

Using only the basic notation for a canonical system, a
specification for a computer language often becomes lengthy.
It will be convenient during the course of this dissertation
to abbreviate some canonical system constructions. Here, I
introduce four simple and useful abbreviations, the first
two of which are due to Donovan [3, 5]. The ability of canonical
systems to define abbreviations formally will be discussed
in Section 2.2c.

1.a If c_1, c_2, ... and c_n are conclusions with identical
premises p, the productions

p -> c_1;  p -> c_2;  ...  p -> c_n;

can be abbreviated

p -> c_1, c_2, ..., c_n;

1.b If c_1, c_2, ... and c_n are conclusions with no premises,
the productions

c_1;  c_2;  ...  c_n;

can be abbreviated

c_1, c_2, ..., c_n;

2. If <t_1>, <t_2>, ... and <t_n> are term tuples denoting
members of the same set S, the atomic formulas

S<t_1>, S<t_2>, ..., S<t_n>

can be abbreviated

S<t_1>,<t_2>, ...,<t_n>

3. If p_1, p_2, ... and p_n are premises with the same
conclusion c, the productions

p_1 -> c;  p_2 -> c;  ...  p_n -> c;

can be abbreviated

p_1 | p_2 | ... | p_n -> c;

4. If a and b are different variables, and P and R are
predicates, the productions

P<a> -> R<a>;  P<a>, R<b> -> R<ba>;

can be abbreviated

P<a> -> R<SEQ(a)>;

Thus, the productions*

(a) DIGIT<1>; DIGIT<2>; DIGIT<3>;

(b) DIGIT<p> -> CHAR<p>; LETTER<p> -> CHAR<p>;
MARK<p> -> CHAR<p>;

(c) DIGIT<d> -> DIGIT STR<d>; DIGIT<d>, DIGIT STR<s>
-> DIGIT STR<sd>;

can be abbreviated

(a) DIGIT<1>,<2>,<3>;

(b) DIGIT<p> | LETTER<p> | MARK<p> -> CHAR<p>;

(c) DIGIT<d> -> DIGIT STR<SEQ(d)>;

The abbreviated productions may informally be read:

(a) The symbols "1", "2", and "3" are digits.

(b) If p is a digit, or p is a letter, or p is a mark,
then p is a character.

(c) If d is a digit, then a sequence of digits is a digit
string.
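
Because each abbreviation is a fixed rewriting of productions, it can be undone mechanically. The Python sketch below expands abbreviations 3 and 4 only, treating productions as plain strings written with "->" for the implication sign. The function names are hypothetical, and the choice of "s" as the added variable in the SEQ expansion is an assumption (any variable different from the one already present would do).

```python
import re


def expand_alternatives(production):
    """Abbreviation 3: 'p_1 | p_2 | ... -> c;' gives one production per premise."""
    body = production.strip().rstrip(";")
    premises, conclusion = body.split("->")
    return [f"{p.strip()} -> {conclusion.strip()};" for p in premises.split("|")]


def expand_seq(production):
    """Abbreviation 4: 'P<a> -> R<SEQ(a)>;' gives the two recursive productions."""
    match = re.fullmatch(
        r"\s*(.+?)<(\w)>\s*->\s*(.+?)<SEQ\((\w)\)>\s*;\s*", production)
    if match is None or match.group(2) != match.group(4):
        raise ValueError("not of the form P<a> -> R<SEQ(a)>;")
    p, a, r = match.group(1), match.group(2), match.group(3)
    b = "s" if a != "s" else "t"  # assumed fresh variable, different from a
    return [f"{p}<{a}> -> {r}<{a}>;", f"{p}<{a}>, {r}<{b}> -> {r}<{b}{a}>;"]
```

Applied to the abbreviated productions (b) and (c) above, these functions give back the unabbreviated productions (b) and (c).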



*Productions (b) and (c) are from the canonical system defining
the syntax of ALGOL/60.

2.1b Application to Specify Syntax

I define the syntax of a language as the set of rules
specifying the strings in a language. The syntax of ALGOL/60
has the requirement that the type of each variable used in
a program must be declared. This requirement is not handled
by the Backus-Naur form specification of the ALGOL/60 subset
given previously. For example, the syntactically illegal
string

BEGIN INTEGER B; A:=1 END

can be derived using this specification. This requirement
can readily be handled with a canonical system definition of
the subset by

(a) specifying with each statement an auxiliary term
specifying the list of variables used in the
statement,

(b) specifying with each declaration an auxiliary term
specifying the list of variables declared, and

(c) adding a premise to the production for a legal
program specifying that each variable occurring
in the list in (a) must be contained in the list
in (b).

The canonical system for the subset of ALGOL/60 is given
in Appendix 1.1a. There the second element in the term tuple
for a primary, arithmetic expression, statement, and declaration
specifies the list of variables used or declared in the
corresponding source language string. The restrictive premise
"IN<u:v>" (production 5) ensures that each of the variables
in the list "u" is contained in the list of declared variables
"v". For example, the following pairs of lists are members
of the set named "IN" (productions 6)

<A,:A,B,>   <B,:A,B,>   <A,B,:A,B,>   <A,B,A,B,:A,B,>

Thus the string

BEGIN INTEGER A; A:=1 END

is specified by this canonical system, whereas the illegal
string

BEGIN INTEGER B; A:=1 END

is not specified by this canonical system because the pair
<A,:B,> is not a member of the set named "IN".
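
The relation named "IN" can be paraphrased as a simple membership check. The Python sketch below is only an illustration of the relation as described in the text; it is not a transcription of productions 6, which appear in Appendix 1.1a and are not reproduced here. Lists are written as in the text, with a comma after each variable, and the names members and in_list are hypothetical.

```python
def members(variable_list):
    """Split a list such as "A,B," into its elements, here ["A", "B"]."""
    return [name for name in variable_list.split(",") if name]


def in_list(used, declared):
    """True when every variable in `used` also occurs in `declared`."""
    declared_names = set(members(declared))
    return all(name in declared_names for name in members(used))
```

With this reading, in_list("A,", "A,B,") holds, while in_list("A,", "B,") does not, which mirrors the rejection of the program that uses the undeclared variable A.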

An Abbreviation for Specifying Syntax: 

In the specification of computer languages, it will be 
frequently necessary to write productions that specify auxil- 
iary lists with a given source language construction. For 
example, consider the productions from Appendix 1.1a 

3.1 DIGIT<d> -> PRIMARY:VARS<d:Λ>;

3.4 PRIMARY:VARS<p:v>, ARITH EXP:VARS<a:u>
-> ARITH EXP:VARS<a+p:uv>;

Here the auxiliary terms corresponding to the predicate part 
"VARS" specify the list of variables used in each construction. 
Productions like these, in which 

(a) an auxiliary term for an auxiliary predicate part
in a conclusion is given as "Λ", and the auxiliary
predicate part does not occur in a premise (e.g.,
the auxiliary term "Λ" for the predicate part
"VARS" in production 3.1), or

(b) an auxiliary term for an auxiliary predicate part
in a premise is a variable, and the auxiliary term
for the same predicate part in a conclusion contains
one occurrence of the variable (e.g., the
variables "u" and "v" for the predicate part "VARS"
in production 3.4),

occur frequently in canonical systems for computer languages.
It is convenient not to have to specify explicitly the auxil- 
iary terms and their predicate parts in these cases. I 
therefore introduce the following abbreviation: 

(a) If p is an auxiliary predicate part occurring only 

in the conclusion of a production, 
and the term t corresponding to p is given as null, 
then ":p" and ":t" can be deleted from the production, 

(b) If p is an auxiliary predicate part occurring in a 

premise and a conclusion, 
and the term t corresponding to the occurrence of 

p in the premise is given as a variable, 
and the term u corresponding to the occurrence of 

p in the conclusion contains one occurrence 

of the variable, 
and the variable does not occur elsewhere in the 

production, 
then the occurrence of ":p" and ":t" in the premise 

and the occurrence of the variable in the con- 
clusion can be deleted. 

Thus production 3.1 above can be abbreviated

3.1 DIGIT<d> -> PRIMARY:VARS<d:Λ>;

3.1' DIGIT<d> -> PRIMARY<d>; (use abr a)

and production 3.4 above can be abbreviated

3.4 PRIMARY:VARS<p:v>, ARITH EXP:VARS<a:u>
-> ARITH EXP:VARS<a+p:uv>;

3.4' PRIMARY<p>, ARITH EXP:VARS<a:u> -> ARITH EXP:VARS<a+p:u>;
(use abr b)

3.4'' PRIMARY<p>, ARITH EXP<a> -> ARITH EXP:VARS<a+p:Λ>;
(use abr b)

3.4''' PRIMARY<p>, ARITH EXP<a> -> ARITH EXP<a+p>; (use abr a)

To obtain the unabbreviated equivalent of a production 
to which this abbreviation has been applied, one can 

(a) Write down the abbreviated production.

(b) Write down the corresponding unabbreviated predi- 
cates used in the production. 

(c) Specify for each predicate part occurring only in 
the conclusion a corresponding null term. 

(d) Specify for each predicate part occurring both in 
a premise and in a conclusion a term that consists 
of a variable that does not occur elsewhere in the 
production. 

Using rule (c), the production corresponding to

(prod 3.1') DIGIT<d> -> PRIMARY<d>;
(predicates) DIGIT    PRIMARY:VARS

can be unabbreviated

3.1 DIGIT<d> -> PRIMARY:VARS<d:Λ>;

Using rule (d), the production corresponding to

(prod 3.4''') PRIMARY<p>, ARITH EXP<a> -> ARITH EXP<a+p>;
(predicates) PRIMARY:VARS    ARITH EXP:VARS    ARITH EXP:VARS

can be unabbreviated*

PRIMARY:VARS<p:v>, ARITH EXP:VARS<a:u> -> ARITH EXP:VARS<a+p:uv>;

To ensure the unique reversibility of this abbreviation, the
first predicate part of each different predicate must be
different, and the order in which added variables occur within
the conclusion must be immaterial.

*The variables "u" and "v" added to production 3.4''' need not
be identical to those given in production 3.4. A production
with different variables is equivalent [2] in that each defines
the same set of strings.

Using this and the previously given abbreviations, the
canonical system of Appendix 1.1a has been abbreviated into the 
canonical system of Appendix 1.1b. The abbreviated canonical 
system can be viewed quite differently from its unabbreviated 
equivalent. For example, consider the abbreviated productions 

3.2' VAR<v> -> PRIMARY:VARS<v:v,>;
3.3' PRIMARY<p> -> ARITH EXP<p>;

and their unabbreviated equivalents

3.2 VAR<v> -> PRIMARY:VARS<v:v,>;

3.3 PRIMARY:VARS<p:v> -> ARITH EXP:VARS<p:v>;

In production 3.2, a new auxiliary term "v," is specified for
the auxiliary predicate part "VARS" and this auxiliary predicate
and term are specified in the abbreviated production
3.2'. In production 3.3, however, the auxiliary list of
variables is carried unchanged from the premise to the conclusion,
and this list is not specified in the abbreviated
production 3.3'.

Furthermore, consider the production 

5. STM:VARS<s:u>, DEC:DEC VARS<d:v>, IN<u:v>
-> PROGRAM<BEGIN d; s END>;

Here the auxiliary lists of variables "u" and "v" are con- 
strained by the premise "IN<u:v>", and hence the auxiliary 
predicate parts and terms for these lists occur in both the 
abbreviated and unabbreviated productions. 

Thus the auxiliary terms referring to the lists of variables
and their associated auxiliary predicate parts are explicitly
specified only when a new variable is added to the list (productions
3.2, 3.5 and 4.2) or when the list is required to have
certain properties (production 5.). In languages like
SNOBOL/1 and ALGOL/60, where the number of auxiliary terms is
large, the abbreviation just given markedly reduced the size
of their canonical systems specifying syntax.

2.1c Application to Specify Translation 

I define the translation of a language as the function
mapping the strings in the language into strings in some
other language. This function can be specified by a canonical
system specifying a set of pairs of strings, where the first
element in each pair is a legal string in the source language,
and the second element is a corresponding string in the
target language.

As in the previous section, I will illustrate this use
of canonical systems by example. The specification of the syntax
of the ALGOL/60 subset has been modified to specify not
only the legal strings in the subset but also their translation
into IBM System/360 assembler language. This specification
is given in Appendix 1.2a. There the term to the left
of each ".." specifies some string in the ALGOL/60 subset,
and the term to the right of each ".." specifies the representation
of the string in the target language. For example,
the following pair of strings is a member of the set named
"PROGRAM":

BEGIN INTEGER A; A:=1 END .. *ASSEMBLER LANGUAGE PROGRAM
         BALR  15,0       *SET BASE REGISTER
         USING *,15       *INFORM ASSEMBLER
         L     1,=F'1'    *LOAD 1
         ST    1,A        *STORE RESULT IN A
         SVC              *RETURN TO SUPERVISOR
*STORAGE FOR VARIABLES
A        DS    F
         END

Note that this canonical system includes the specification of
the comment entries in the assembler statements so that
(hopefully) the reader will not have to be familiar with the
assembler language to understand the translation.

An Abbreviation for Specifying Translation: 

Except for the specification of strings in assembler 
language, the canonical system defining the translation of the 
subset is identical to the canonical system defining the syntax 
of the subset. In general, since a definition of the syntax 
of a language specifies the legal strings in a language and 
a definition of the translation of a language specifies the 
legal strings as well as their representation in some other 
language, the definition of the translation of a language will 
encompass the definition of the syntax of a language. This 
similarity leads to the following abbreviation. 

Let numbers be placed on the productions of the canonical
systems for the syntax and translation so that a production
specifying the translation of a string is given the same
number as the corresponding production specifying the syntax
of the string. Let p_s and p_t be identically numbered
productions from the canonical systems specifying respectively
the syntax and translation.

(a) If p_s and p_t are identical, then p_t can be omitted.

(b) If a premise in p_s and a premise in p_t are identical,
    then the premise in p_t can be omitted.

(c) If an auxiliary predicate part and corresponding
    term of atomic formulas with identical first predicate
    parts in p_s and p_t are identical, then the
    auxiliary predicate part and term in p_t can be
    omitted.

For example, consider the production from the syntax of
the ALGOL/60 subset

5. STM:VARS<s:u>, DEC:DEC VARS<d:v>, IN<u:v>
   → PROGRAM<BEGIN d; s END>;

and the corresponding production from the translation of the 
subset 

5.' STM:VARS<s..s':u>, DEC:DEC VARS<d..d':v>, IN<u:v>
    → PROGRAM<BEGIN d; s END..a>;

where a represents the string that specifies the translation
of the program. Here, using rule (b), the premise "IN<u:v>"
can be omitted from the translation production, and using
rule (c) the auxiliary predicate parts and terms for the
lists "u" and "v" of variables can be omitted to yield the
abbreviated production for the translation

5.'' STM<s..s'>, DEC<d..d'> → PROGRAM<BEGIN d; s END..a>;

To obtain the unabbreviated equivalent of an abbreviated
canonical system defining translation, one must add to the
canonical system defining translation (a) the numbered
productions that occur in the canonical system for the syntax
but do not occur in the canonical system for translation, (b)
the premises that occur in a production for syntax but do not
occur in the identically numbered production for translation,
and (c) for atomic formulas with identical first predicate
parts, the auxiliary predicate parts and corresponding terms
that occur in a production for syntax but do not occur in the
identically numbered production for the translation.

For example, consider the abbreviated translation production
just given

5.'' STM<s..s'>, DEC<d..d'> → PROGRAM<BEGIN d; s END..a>;

and the corresponding production for the syntax

5. STM:VARS<s:u>, DEC:DEC VARS<d:v>, IN<u:v>
   → PROGRAM<BEGIN d; s END>;

Here, the premise "IN<u:v>" occurs in the production for the
syntax but not in the production for the translation, and the
auxiliary predicate parts and corresponding terms for the
predicate parts "VARS" and "DEC VARS" occur in the production
for the syntax but not in the production for the translation.
Adding this premise and these auxiliary predicate parts and their
terms to the abbreviated production 5.'' for the translation,
we obtain the unabbreviated production

5.' STM:VARS<s..s':u>, DEC:DEC VARS<d..d':v>, IN<u:v>
    → PROGRAM<BEGIN d; s END..a>;
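
The restoration of production 5.' from productions 5. and 5.'' can be
sketched mechanically. The Python below is an illustration only and is
not part of the formal definition: the representation of an atomic
formula as a pair (predicate parts, terms) and the names merge_formula,
unabbreviate and show are assumptions made for this sketch. It covers
rules (b) and (c) for a single pair of identically numbered productions;
rule (a), the copying of whole productions, is not shown.

```python
# Hypothetical representation: an atomic formula is (predicate_parts, terms),
# and a production is (list_of_premises, conclusion).

def merge_formula(syn, tr):
    """Rule (c): restore the auxiliary predicate parts and terms that the
    abbreviated translation formula omits, taking them from the syntax formula."""
    parts = tr[0] + syn[0][len(tr[0]):]
    terms = tr[1] + syn[1][len(tr[1]):]
    return (parts, terms)


def unabbreviate(syntax, translation):
    """Combine identically numbered syntax and translation productions."""
    syn_premises, syn_conclusion = syntax
    tr_premises, tr_conclusion = translation
    # Assumption of this sketch: within one production, the first predicate
    # part is enough to pair a translation premise with a syntax premise.
    tr_by_first = {f[0][0]: f for f in tr_premises}
    syn_firsts = {f[0][0] for f in syn_premises}
    premises = []
    for syn in syn_premises:
        tr = tr_by_first.get(syn[0][0])
        # Rule (b): a premise missing from the translation is copied from the syntax.
        premises.append(merge_formula(syn, tr) if tr is not None else syn)
    # Premises that occur only in the translation are kept as written.
    premises += [f for f in tr_premises if f[0][0] not in syn_firsts]
    return premises, merge_formula(syn_conclusion, tr_conclusion)


def show(production):
    """Print a production in the notation used in the text."""
    def fmt(f):
        return ":".join(f[0]) + "<" + ":".join(f[1]) + ">"
    premises, conclusion = production
    return ", ".join(fmt(p) for p in premises) + " → " + fmt(conclusion) + ";"


syntax_5 = (
    [(("STM", "VARS"), ("s", "u")),
     (("DEC", "DEC VARS"), ("d", "v")),
     (("IN",), ("u", "v"))],
    (("PROGRAM",), ("BEGIN d; s END",)),
)
translation_5 = (
    [(("STM",), ("s..s'",)), (("DEC",), ("d..d'",))],
    (("PROGRAM",), ("BEGIN d; s END..a",)),
)
print(show(unabbreviate(syntax_5, translation_5)))
```

Under these assumptions the printed production agrees with the
unabbreviated production 5.' given above.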

The abbreviated canonical system specifying the translation
of the ALGOL/60 subset is given in Appendix 1.2b. The
abbreviated canonical system of Appendix 1.2b can be viewed
quite differently from its unabbreviated equivalent. The
abbreviated canonical system need specify only the new terms
that must be added to the canonical system specifying the syntax
in order to convert the canonical system specifying syntax
into the canonical system specifying translation. In writing
the abbreviated canonical system specifying translation, the
requirements needed to insure the syntactic legality of a
string whose translation is being specified can be omitted.
These requirements are assumed to have been specified in
the canonical system for the syntax. In languages like
SNOBOL/1 and ALGOL/60, where the number of syntactic
requirements is large, this abbreviation greatly reduced the size
of the canonical systems defining the translations of the
languages into the target language.

2.2 Defining Canonical Systems

2.2a The Notion of a Defining Canonical System

The previous sections have been devoted to developing
canonical systems specifying sets of strings. The strings
represented syntactically legal programs in a subset of ALGOL/60
and their counterparts in assembler language. The rules for
forming and using the canonical systems for these sets were
described informally in the text in English. The string
representing a canonical system and the rules for using the
canonical system can, in turn, be specified formally by another
canonical system. In cases where a conflict would arise in
distinguishing the strings of the first canonical system in
the productions of the defining canonical system, the strings
of the first canonical system can be enclosed by the quotation
marks "`" and "'".

The productions specifying the rules for constructing 
another canonical system are given in Appendix 1.3a. These 
productions specify the alphabets of object symbols, predicate 
symbols, and variable symbols, and the rules for constructing 
well-formed terms, term tuples, atomic formulas, premises, 
conclusions, productions, and finally, canonical systems.*

*In the productions of Appendix 1.3, the quotation marks have
been omitted for matching pairs of left and right brackets
that occur as object symbols. For example, in the atomic
formula "WF TERM TUPLE<<t>>", quotation marks have been omitted
from the second and third brackets. In atomic formulas of
this type, the scope of the left bracket sign extends to the
matching right bracket sign, and all brackets thus enclosed
are considered as object symbols.

The logical notion of using a second canonical system
to formalize the rules for constructing a canonical system
was first presented by Smullyan [2] and later by Donovan [3]. In
the works presented by Smullyan and Donovan, a notation
different from the basic notation is used in a defining canonical
system. The advantages of using quotation marks to distinguish
symbols in the defined canonical system from symbols in the
defining canonical system are that (a) the same notation is
used for all canonical systems, and (b) definitions and rules
formalized in one canonical system can be copied and applied
to other canonical systems independently of their position
in a series of defined and defining canonical systems (this
point will be discussed in section 2.2c).

2.2b Application to Derive Syntactically Legal Programs

The rules for deriving strings specified by a canonical 
system can also be formalized with a defining canonical system. 
These rules are given in Appendix 1.3b. By adding a production 
of the form "CANONICAL SYSTEM STR<c>;", where c is some
well-formed canonical system, these productions define the rules
for deriving strings in the canonical system c. 

In particular, productions 9 specify the rules for
extracting productions from the member of the set "CANONICAL
SYSTEM STR". Production 10 specifies the rule for substituting
strings in the object alphabet in place of the variables
in the productions to obtain instances of the productions.
Productions 11 specify the rules for deriving strings specified
by the production instances.

Productions 10 and 11 can be viewed as a formalization
of the two logical rules of inference "substitution" and "modus
ponens" for deriving strings specified by a canonical system.
The substitution of object strings for variables in a production
occurs through the predicate "SUBST". The predicate
"SUBST" defines a set of 4-tuples, where the first element of
each 4-tuple is a production, the second element is a variable,
the third element some string of object alphabet symbols, and
the fourth element the production with each occurrence of the
variable replaced by the object string. For example, using
the canonical system of the syntax of the ALGOL/60 subset as
a member of the set "CANONICAL SYSTEM STR", the following
4-tuple can be generated as a member of the set "SUBST"

<DIGIT<d>→PRIMARY:VARS<d:Λ> : d : 1 : DIGIT<1>→PRIMARY:VARS<1:Λ>>
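
The four elements of such a tuple can be illustrated with a short
Python sketch. This is an informal illustration, not the defining
production 10 itself: the function name subst is hypothetical, the
production is held as a plain string, and the sketch assumes that the
variable is a lower-case letter that does not occur in any predicate
name, so that a plain textual replacement suffices.

```python
def subst(production: str, variable: str, object_string: str) -> tuple:
    """Return a 4-tuple in the spirit of the set "SUBST":
    (production, variable, object string, production instance)."""
    # Assumption of this sketch: the variable letter occurs only as a variable,
    # never inside a predicate name, so str.replace touches variables only.
    instance = production.replace(variable, object_string)
    return (production, variable, object_string, instance)


four_tuple = subst("DIGIT<d> → PRIMARY:VARS<d:Λ>", "d", "1")
print(four_tuple[3])
```

The fourth element produced here corresponds to the production instance
shown in the tuple above.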

The application of modus ponens to the production instances 
of a canonical system occurs in production 11.1. 

11.1 DERIVATION<Λ>;

11.2 DERIVATION<d>, PROD INSTANCE<c;>, WF CONCLUSION<c>
     → DERIVATION<d c>;

11.3 DERIVATION<d>, PROD INSTANCE<p→c;>,
     PREMS:DERIV CONT PREMS<p:d> → DERIVATION<d c>;

These productions can be read:

11.1 From no premises, the null string can be derived.

11.2 If the string d has been derived,
     and c; is an instance of a production that contains no
     premises,
     then the string c can be added to the string d.

11.3 If the string d has been derived,
     and p→c; is an instance* of a production with premises p,
     and the premises p are contained in the string d,
     then the string c can be added to the string d.

For example, by successively using the following production 
instances 

DIGIT<1>;

DIGIT<1> → PRIMARY:VARS<1:Λ>;

PRIMARY:VARS<1:Λ> → ARITH EXP:VARS<1:Λ>;

the following member of the set "DERIVATION" can be generated

DIGIT<1> PRIMARY:VARS<1:Λ> ARITH EXP:VARS<1:Λ>
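
The way rules 11.1 through 11.3 build up such a derivation can be
sketched in Python. The sketch is an informal reading of the three
rules and not the defining canonical system: the function name derive
and the representation of a production instance as a pair (premises,
conclusion) are assumptions, and, unlike the formal rules, the sketch
adds each conclusion at most once.

```python
def derive(instances):
    """Apply modus ponens to production instances until nothing new is derived.

    Each instance is (premises, conclusion); premises is a list of strings."""
    derivation = []          # rule 11.1: start from the null string
    changed = True
    while changed:
        changed = False
        for premises, conclusion in instances:
            if conclusion in derivation:
                continue
            # Rule 11.2 when there are no premises, rule 11.3 otherwise:
            # every premise must already be contained in the derivation.
            if all(p in derivation for p in premises):
                derivation.append(conclusion)
                changed = True
    return derivation


instances = [
    ([], "DIGIT<1>"),
    (["DIGIT<1>"], "PRIMARY:VARS<1:Λ>"),
    (["PRIMARY:VARS<1:Λ>"], "ARITH EXP:VARS<1:Λ>"),
]
print(" ".join(derive(instances)))
```

Because each conclusion is added only after its premises, the order in
which the instances are listed does not change the result for this
example.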

Another example of a member of the set "DERIVATION" is
generated in the right-hand column of Appendix 1.4a. By simply
asserting that the canonical system defining the syntax of the
ALGOL/60 subset is a member of the set "CANONICAL SYSTEM STR"
(i.e., by simply adding the production "CANONICAL SYSTEM STR
<`DIGIT<1>; ... IN<y:ℓ> → IN<xy:ℓ>;'>;" to the productions
of Appendices 1.3a and 1.3b), Appendix 1.3 defines the rules
for deriving syntactically legal programs in the ALGOL/60
subset. The derivation of Appendix 1.4a specifies that the
string

BEGIN INTEGER A; A:=1 END

is a member of the set "PROGRAM".

Yet another example of a member of the set "DERIVATION"
is generated in the right-hand column of Appendix 1.4b. By
asserting that the canonical system defining the translation
of the ALGOL/60 subset is a member of the set "CANONICAL SYSTEM
STR", Appendix 1.3 defines the rules for deriving syntactically
legal programs and their translation. The derivation of
Appendix 1.4b specifies that the string

BEGIN INTEGER A; A:=1 END..*ASSEMBLER LANGUAGE PROGRAM

         BALR  15,0       *SET BASE REGISTER
         USING *,15       *INFORM ASSEMBLER
         L     1,=F'1'    *LOAD 1
         ST    1,A        *STORE RESULT IN A
         SVC              *RETURN TO SUPERVISOR
*STORAGE FOR VARIABLES
A        DS    F
         END

is a member of the set "PROGRAM".

*An instance of a production P is the production P' obtained
from P by applying substitution to all of the variables in the
production.

Thus by simply adding a production asserting that some 
well-formed canonical system is a member of the set "CANONICAL 
SYSTEM STR", the productions of Appendix 1.3 can be used to 
generate all strings defined by the canonical system. 



Structural Description of Derived Strings:*

A derivation provides a "structural description" of a
derived string [35]. By a structural description of a string,
I mean the sequence of rules (here the sequence of productions)
used in generating the string. The sequence of rules used in
generating a string provides information about the structure
of the string.

*This application is not used in the other sections of this
dissertation.

For example, consider the derivation of Appendix 1.4a.
If we consider only the first term of each derived term tuple,
the derivation provides a structural description for the string
"BEGIN INTEGER A; A:=1 END" that may be represented in the
form of a syntactic tree:

[Figure: syntactic tree for "BEGIN INTEGER A; A:=1 END" with root
PROGRAM; labels legible in the source: BEGIN, END, INTEGER, TYPE LIST,
VAR, ARITH EXP, PRIMARY, DIGIT.]

The tree can be constructed by scanning the derivation 
from bottom to top and constructing the corresponding tree 
from the top down. The leaves of the tree are symbols from 
the object alphabet. The nodes of the tree are the partial 
predicate names occurring in derived conclusions. The branches 
joining a node are determined by the basic symbols and the 
previously derived conclusions used to construct the newly 
derived conclusion.

Using a canonical system for the translation of a language,
a derivation can be used to construct a structural description 
of a target language string. The System/360 assembler language 
is not a "structured" language and hence the derivation of an 
assembler language program is not of concern. However, canon- 
ical systems have been used** to obtain structural descriptions 
of strings in a target language where knowledge of a string's 
tree-like structure is important for its analysis.* 

2.2c Application to Specify Notational Abbreviations

I define an abbreviation as a bijective (one-to-one and 
onto) function mapping one set of strings (the unabbreviated 
strings) into another set of strings (the abbreviated 
strings). The bijectiveness of the function insures that we
can recover the unabbreviated equivalent of each abbreviated 
string. I have introduced six abbreviations to the notation 
for canonical systems, four to the basic notation, one for a 
canonical system specifying syntax, and another for a canoni- 
cal system specifying translation. Each of these abbrevia- 
tions can be specified by a defining canonical system speci- 
fying a set of ordered pairs, where the first element of 
each pair is an abbreviated canonical system, and the second 
element is the corresponding unabbreviated canonical system. 



*A canonical system derivation can lead to much more complicated
structural descriptions than those that can be represented
in tree-like form. I have not studied this issue.

The productions specifying the six abbreviations introduced
to canonical systems are given in Appendix 1.3c. For
example, productions 15.1 and 15.2 in

15.1 WF PROD<p→c;> → ABR1 P:P<p→c;:p→c;>;

15.2 WF PROD<p→c;>, ABR1 P:P<p→s;:t> → ABR1 P:P<p→c,s;:p→c;t>;

15.3 WF ATOM PROD<c;> → ABR1 AP:AP<c;:c;>;

15.4 WF ATOM PROD<c;>, ABR1 AP:AP<s;:t;>
     → ABR1 AP:AP<s,c;:t; c;>;

15.5 ABR1 CS:CS<Λ:Λ>;

15.6 ABR1 CS:CS<c:d>, ABR1 P:P<p:q> → ABR1 CS:CS<cp:dq>;

15.7 ABR1 CS:CS<c:d>, ABR1 AP:AP<p:q> → ABR1 CS:CS<cp:dq>;

specify a set of ordered pairs "ABR1 P:P", where the first
element is a production of the form "p→c_1, c_2, ..., c_n;" and
the second element is the corresponding unabbreviated
productions "p→c_1; p→c_2; ... p→c_n;". Productions 15.3 and 15.4
augment this set to include atomic productions, and productions
15.5 through 15.7 specify the abbreviation for an entire
canonical system.
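
The effect of this first abbreviation on a single production can be
sketched in Python. The sketch is an illustration only, not a
transcription of productions 15: the names split_top_level and
expand_abr1 are hypothetical, productions are held as plain strings,
and the sketch assumes an object-level production in which the arrow
does not occur inside angle brackets. Commas inside brackets, as in the
term "v,", are left alone.

```python
def split_top_level(text, sep=","):
    """Split text at separators that are not enclosed in angle brackets."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def expand_abr1(production, arrow="→"):
    """Expand "p → c_1, c_2, ..., c_n;" into one production per conclusion."""
    body = production.rstrip().rstrip(";")
    if arrow in body:
        # Assumption: the first arrow separates premises from conclusions.
        premises, conclusions = body.split(arrow, 1)
        prefix = premises.strip() + " " + arrow + " "
    else:
        # Atomic productions (no premises), as handled by 15.3 and 15.4.
        prefix, conclusions = "", body
    return [prefix + c + ";" for c in split_top_level(conclusions)]


# The combination of conclusions below is made up for illustration.
for line in expand_abr1("VAR<v> → PRIMARY:VARS<v:v,>, ARITH EXP:VARS<v:v,>;"):
    print(line)
```

Applying expand_abr1 to every production of an abbreviated canonical
system and concatenating the results plays the role that productions
15.5 through 15.7 play in the defining canonical system.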

Similarly, productions 16 through 20 specify the other
five abbreviations to canonical systems.*

*To apply abbreviation 20, the abbreviation for a canonical
system specifying syntax, a production of the form "CS
PREDICATES<p_1, p_2, ..., p_n>", where the p_i, 1≤i≤n, are the
unabbreviated predicates for the canonical system, must be added
to productions 20.

To apply abbreviation 21, the abbreviation for a canonical
system specifying translation, (a) the productions and premises
occurring in the canonical system for syntax but not in
the canonical system for translation must be added to the
canonical system for translation, and (b) atomic formulas with
identical first predicate parts from identically numbered
productions from the canonical systems for the syntax and
translation must be written together in the canonical system
for translation and separated by "//".

Productions 21 and 22 specify abbreviations used in defining
ALGOL/60 and will be discussed in the chapter on ALGOL/60.
Finally, production 23 specifies the rule for converting some
string (presumably a well-formed abbreviated canonical system)
that is asserted to be a member of the set "ABR CANONICAL
SYSTEM STR" into the corresponding member of the set "CANONICAL
SYSTEM STR" (the unabbreviated equivalent of the abbreviated
canonical system).* For example, by asserting that the
abbreviated canonical system of Appendix 1.1b is an abbreviated
canonical system (i.e., by adding the production asserting that
the canonical system of Appendix 1.1b is a member of the set
"ABR CANONICAL SYSTEM STR"), the productions of Appendix 1.3c
can be used to derive the conclusion that the canonical system
of Appendix 1.1a is its corresponding unabbreviated equivalent
(i.e., the canonical system of Appendix 1.1a is a member of the
set "CANONICAL SYSTEM STR"). Similarly, by asserting that the
canonical system of Appendix 1.2b is a member of the set "ABR
CANONICAL SYSTEM STR", production 24 can be used to derive the
conclusion that the canonical system of Appendix 1.2a is its
unabbreviated equivalent.**

*The order in which abbreviations are removed from an
abbreviated canonical system will generally depend on the
abbreviations introduced. Production 23 defines one order in
which the abbreviations introduced in this dissertation can be
removed. Furthermore, any premise in production 23 that
refers to an abbreviation not used in a particular abbreviated
canonical system can be removed.

**As mentioned previously, an atomic production specifying the
unabbreviated predicates of an abbreviated canonical system
specifying syntax must be added to the defining canonical
system to generate the correct unabbreviated (cont. next page)

In general, by

(a) specifying the sets of ordered pairs defining
    some abbreviations, and

(b) adding a production like production 23 defining
    the rule for converting an abbreviated canonical
    system into its unabbreviated equivalent,

a defining canonical system can be used to generate the
unabbreviated equivalent of any abbreviated canonical system.
Moreover, having generated the equivalent unabbreviated 
canonical system, the productions of Appendix 1.3a and 1.3b 
can then be used to derive strings specified by the canoni- 
cal system. 

The productions of Appendix 1.3 are written using only
the first two abbreviations to the basic notation. To define
Appendix 1.3 using only the basic notation, the user could
write a third canonical system, which would consist of simply
(a) a production asserting that the canonical system of
Appendix 1.3 is a member of the set "ABR CANONICAL SYSTEM STR",
(b) productions 15 and 16 of Appendix 1.3 (these productions
contain no abbreviations), and (c) the production "ABR CANONICAL
SYSTEM STR<a>, ABR2 CS:CS<a:b>, ABR1 CS:CS<b:c> → CANONICAL
SYSTEM STR<c>;". The user would then have a series of three
canonical systems. The first (abbreviated) canonical
system (e.g., Appendices 1.1b or 1.2b) would define the
allowable strings in some source language. The
second canonical system would define the rules for forming
the first canonical system, the rules for deriving strings
specified by the first canonical system, and the rules for
converting the first canonical system into the basic notation.
The third canonical system would define the rules for converting
the second canonical system into the basic notation.
Thus, the series of canonical systems would ultimately be
defined using only the basic notation. In general, a user
may write a series of canonical systems to define the rules
for constructing and using other canonical systems; in order
for the series to be defined using only the basic canonical
system notation, only the last member of the series need be
written in the basic notation.

**(Cont. from p. 41) canonical system, and the productions
of the abbreviated canonical systems specifying syntax and
translation must be combined (according to the rules given
earlier) to generate the complete unabbreviated canonical
system specifying translation.

Note that productions 15 and 16 of Appendix 1.3 could 
be copied unchanged in the third canonical system. These 
productions formalize rules that are applicable to two 
canonical systems independently of their relative positions 
in a series of canonical systems. In fact, these productions 
can be copied and applied to the canonical system in which 
they themselves are given. 

User-Coined Abbreviations: 

Defining canonical systems provides a writer of a canonical
system with a formal mechanism for introducing his own
abbreviations to the notation. For example, consider the
productions (from the canonical system of ALGOL/60):

PRIMARY<p> → TERM<p>;

PRIMARY<p>, MULT OP<m>, TERM<t> → TERM<tmp>;

The user may wish to abbreviate these productions:

PRIMARY<p>, MULT OP<m> → TERM<ALTSEQ(p m)>;

Productions 21 of Appendix 1.3c specify this abbreviation (as
well as other variants of this abbreviation). Thus by simply
adding new productions to the canonical system defining the
conversion of an abbreviated canonical system to unabbreviated
form, the notation for canonical systems can be tailored to
fit a particular application.

2.3 Discussion

Canonical systems have placed under a single framework 
the complete definition of the syntax and translation of a 
language. The formalism was used to specify all legal pro- 
grams, their translations into assembler language, the rules 
for deriving legal programs and their translations, and the 
rules for removing abbreviations from the specifications. 
Not once was it necessary to introduce concepts outside 
canonical systems; although some complexity was added to the 
formalism by introducing abbreviations to the basic notation, 
even the abbreviations were ultimately defined in terms of 
the basic formalism. 

It is important to develop languages whose descriptions 
are concise. The Backus-Naur form specification of the ALGOL/60
subset and the English sentence describing the context-sensitive
requirement provide one very concise and easily understandable
description of the syntax of the subset. The
canonical system of Appendix 1.1 has, in fact, been modeled
after this description. Productions 1 through 5 correspond
(except for the auxiliary elements generating the lists of
used and declared variables) to the Backus-Naur form
productions; the premise "IN<u:v>" in production 5 and the
definition of the predicate "IN" formalize the context-sensitive
restriction stated in English.

The canonical system of Appendix 1.1 is not much more 
lengthy than the Backus-Naur form definition of the subset 
and the associated English sentence describing the context- 
sensitive restriction. Like Backus-Naur form, the language 
of canonical systems is readable. On the other hand, canoni- 
cal systems have the added power to characterize completely 
both the syntax of a language and its translation into a 
target language, without resorting to the English language.
Moreover, the notation for canonical systems is not fixed. 
By changing or adding productions to a defining canonical 
system, the user can alter or abbreviate the notation for a 
defined canonical system to fit a particular language. 

I wish to point out two additional features of the 
canonical systems of Appendices 1.1 and 1.2. First, barring 
any inadvertent errors, the canonical systems describe a set 
of ALGOL/60 programs and assembler language programs that
will run on a computer when translated by an ALGOL/60 compiler
or System/360 assembler. Second, the specification of the
comment entries in the assembler language statements was
provided not only to aid the reader. The comments are meaning- 
ful context-sensitive strings in the English language. The 
specification of these strings was handled as easily as the 
specification of the strings in assembler language. The 
specification of the strings in the English language illus- 
trates the use of canonical systems to specify the entire 
operation of a translator, including the specification of 
meaningful comments. Moreover, it suggests the capacity of 
canonical systems to define string transformations in lan- 
guages other than computer programming languages. 

One use of canonical systems is in the development of a 
generalized translator for computer languages, i.e., a trans- 
lator that is independent of both source and target languages. 
Canonical systems define a set by specifying rules for 
generating its members. To use a canonical system as a lan- 
guage for writing translators, an algorithm to recognize 
strings specified by a canonical system and output associated 
strings is needed. No algorithm for recognizing and construct- 
ing strings specified by a canonical system is presented in 
this dissertation. However, one algorithm for canonical 
systems has been devised and implemented by Alsop. 3 ^ 

Several important issues for using canonical systems in 
a generalized translator have not been studied. One critical
issue is the development of a restriction on canonical
systems to define only recursive sets rather than recursively 
enumerable sets. Theoretically, an algorithm for recognizing 
a string defined by a canonical system exists only if the set 
of strings defined by the canonical system is recursive. 
Other critical issues include speed of translation, recovery 
in case of an error in a source language program, and code 
optimization of target language programs. I expect that 
modifications to the basic formalism presented here will be 
necessary to use canonical systems in a generalized trans- 
lator. 
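
The distinction between recursive and recursively enumerable sets can
be made concrete with a small Python sketch. It is an illustration
under stated assumptions, not an algorithm from this dissertation: the
names recognize and even_length_as are hypothetical, and the generator
stands in for any procedure that enumerates the strings specified by a
canonical system. Enumeration alone can confirm membership, but without
a bound on the search it cannot report non-membership.

```python
from itertools import count, islice


def recognize(target, enumerate_set, max_steps=None):
    """Search an enumeration for target.

    Returns True if target is generated. Returns None if the step bound is
    exhausted, which means "unknown", not "absent". With max_steps=None the
    search does not terminate when target is not in the set."""
    stream = enumerate_set()
    if max_steps is not None:
        stream = islice(stream, max_steps)
    for generated in stream:
        if generated == target:
            return True
    return None


def even_length_as():
    """Toy enumeration: the strings Λ, AA, AAAA, ... in order of length."""
    for n in count(0):
        yield "A" * (2 * n)


print(recognize("AAAA", even_length_as, max_steps=10))
print(recognize("AAA", even_length_as, max_steps=10))
```

For this toy set a decision procedure obviously exists (test the length
of the string); the point of the sketch is only that the enumeration by
itself does not supply one.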

The notion of defining canonical systems unfolds several 
possibilities for using canonical systems as a tool for working
with computer languages. Just as a canonical system allows 
a user to change a source or target language construction by 
simply changing the productions specifying the construction, 
a defining canonical system allows the user to change the 
definition or use of a defined canonical system by simply 
changing productions of the defining canonical system. Al- 
though only rules for removing abbreviations from a canonical 
system and rules for deriving strings specified by a canoni- 
cal system have been defined here, defining canonical systems 
may provide a flexible mechanism for embedding many other 
rules for defining and manipulating computer languages. 

As mentioned earlier, the results of this chapter apply 
to any recursively enumerable set. Any function or relation
that is recursively enumerable can be specified by a canonical
system. Canonical systems can be used to express algo-
rithms and string transformations of a much different nature 
from those given here. The notion of defining canonical 
systems adds to the basic formalism a facility for allowing 
a user to formalize his own rules for defining and manipulat- 
ing strings and their canonical systems. The modifications 
to the basic formalism presented here have been directed 
towards the application of canonical systems to define the 
syntax and translation of a language. But more importantly,
canonical systems provide a definitional facility that the
user has the freedom to tailor according to his own application
and style.

CHAPTER III

EXTENDED MARKOV ALGORITHMS AND λ-CALCULUS:

A COMBINED FORMALISM USED AS THE BASIS

FOR A TARGET LANGUAGE FOR DEFINING SEMANTICS

This chapter presents a formal language (henceforth 
referred to as the target language) quite different from con- 
ventional machine or assembler language for defining the 
semantics of a computer language. 

The semantics of a language can be defined as the set of
rules relating the strings in a language to the behavior or 
objects that the strings denote. The behavior or object that 
a string denotes can be described by a string in some other 
language whose meaning is presumably understood. This approach 
to defining the semantics of computer languages will be taken 
in this chapter, namely, the presentation of a single language 
(whose meaning is presumably understood) for defining the 
semantics of multiple other languages. The semantics of a 
given source language will be specified by defining the trans- 
lation of the language into the target language. 

The semantics of the target language, however, will not
be left to an English language explanation in the text. The
semantics of the target language will be further explicated
in Section 3.2 by giving a formal definition of a machine*
that performs the computation indicated by a target language
string and produces the string denoted by the target language
string. (In defining the semantics of a computer language,
the word "computation" can be considered synonymous with the
word "behavior" and all "objects" in a computer language can
be considered as strings.) Thus the appeal to understanding
the semantics of a computer language will be ultimately reduced
to understanding the formalism in which the operation of
the target language evaluating mechanism is expressed.

*"Machine" in the sense of a set of logical rules.

Generally, the semantics of different languages will be 
specified by giving different translations into the target 
language while leaving the definition of the target language 
evaluating mechanism unchanged. On the other hand, the 
definition of the evaluating mechanism can be changed to define 
source language constructs that appear difficult to define in 
the target language.* 

*This was done to define indirect addressing in SNOBOL/1. 

The target language presented here is based on the 
formalism of Markov algorithms [9], an extension to Markov 
algorithms due to Caracciolo [10, 11, 12], and the formalism of 
the λ-calculus of Alonzo Church [17, 18]. Extended Markov 
algorithms are used to define the primitive functions in a 
computer language; the λ-calculus is used to define new 
functions from the primitive functions. In a sense, the target 
language draws upon the best of each formalism. Markov 
algorithms explicate the notion of an algorithm operating on a 
string and are especially well-suited to the definition of 
primitive functions transforming strings into new strings. The 
λ-calculus explicates the notion of a function and is 
especially well-suited to the definition of new functions from 
the primitive functions. 

The target language has several important properties. 
The language is formally based, and theorems regarding the 
completeness of the formalisms to define the set of all 
"computable" functions exist [31, 32]. The language is 
independent of the characteristics of existing computers. The 
basic notation for the target language is simple. Probably 
most importantly, the correspondence between many computer 
languages and the target language is somewhat simpler than the 
correspondence between computer languages and conventional 
machine or assembler languages. 

3.1 The Target Language 

3.1a Extended Markov Algorithms 

Markov Algorithms: 

Let A be an alphabet of characters, called the object 
alphabet, and let "→", "·" and "Λ" be characters not in A. 
A Markov algorithm is a finite list of substitution rules of 
the form 

    s_1 →(·) t_1 
    s_2 →(·) t_2 
    ... 
    s_n →(·) t_n 

where the s_i and t_i, 1≤i≤n, are either "Λ" or strings of 
object alphabet characters, and "(·)" indicates the possible 
occurrence of a "·" after the "→". The symbol "Λ" denotes 
the null string. 

A Markov algorithm of the above form when applied to an 
object string X is taken to mean: 

(a) Look down among the substitution rules for the 
    first rule such that s_i occurs in X. 

(b) If such a rule is found, replace the leftmost 
    occurrence of s_i in X by the string t_i. If a "·" occurs 
    after the "→" in the substitution rule, terminate 
    the algorithm. Otherwise repeat the application of 
    the algorithm to the newly formed string. 

(c) If no such rule is found, terminate the algorithm. 

For example, the Markov algorithm 

    B → D 
    C → F 
    O → I 

transforms the string "COBBLER" into the string "FIDDLER", 
whereas the Markov algorithm 

    B → D 
    C →· T 
    O → I 

transforms the string "COBBLER" into the string "TODDLER". 
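
As an illustration only (it is not part of the formalism itself), steps (a) to (c) can be sketched in Python. The function name `run_markov`, the triple representation of a rule, and the step limit are assumptions of this sketch; the empty string stands in for the null string Λ.

```python
def run_markov(rules, text, max_steps=10_000):
    """Apply a Markov algorithm to `text`.

    `rules` is an ordered list of (s, t, terminal) triples; the empty
    string plays the role of the null string.
    """
    for _ in range(max_steps):
        for s, t, terminal in rules:
            pos = text.find(s)            # leftmost occurrence, -1 if absent
            if pos >= 0:
                text = text[:pos] + t + text[pos + len(s):]
                break
        else:
            return text                   # (c) no rule applies
        if terminal:
            return text                   # (b) a terminal rule was applied
    raise RuntimeError("step limit exceeded")


fiddler = run_markov(
    [("B", "D", False), ("C", "F", False), ("O", "I", False)], "COBBLER")
toddler = run_markov(
    [("B", "D", False), ("C", "T", True), ("O", "I", False)], "COBBLER")
print(fiddler, toddler)
```

The step limit is only a guard for this sketch: a Markov algorithm need not terminate, and the formalism itself imposes no such bound.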
Consider the following Markov algorithm for taking a 
parenthesized string of letters from the alphabet {I,O,N,X} 
and producing a string where the initial letters are reversed. 
(Here the character "*" is used as a marker, and the object 
alphabet consists of the characters {I O N X ( ) *}.) 

    II* → I*I 
    IO* → O*I 
    IN* → N*I 
    IX* → X*I 
    OI* → I*O 
    OO* → O*O 
    ON* → N*O 
    OX* → X*O 
    NI* → I*N 
    NO* → O*N 
    NN* → N*N 
    NX* → X*N 
    XI* → I*X 
    XO* → O*X 
    XN* → N*X 
    XX* → X*X 
    (I* → I( 
    (O* → O( 
    (N* → N( 
    (X* → X( 
    ()  →· Λ 
    )   → *) 

    A Markov algorithm for reversing a parenthesized 
    string of letters {I O N X} 


This algorithm when applied to the string "(NOXIN)" 
successively transforms it into the following strings 

    (NOXIN) → (NOXIN*) → (NOXN*I) → (NON*XI) → (NN*OXI) 
    → (N*NOXI) → N(NOXI) → N(NOXI*) → N(NOI*X) 
    → N(NI*OX) → N(I*NOX) → NI(NOX) → NI(NOX*) 
    → NI(NX*O) → NI(X*NO) → NIX(NO) → NIX(NO*) 
    → NIX(O*N) → NIXO(N) → NIXO(N*) → NIXON() 
    →· NIXON 

Even quite simple algorithms like the above become 
exceedingly lengthy when expressed in the Markov formalism. If 
the alphabet above included all 26 letters in the English 
alphabet, the Markov algorithm for reversing the letters in a 
string would require 704 substitution rules. To alleviate this 
growth, Caracciolo di Forino [10, 11, 12], in developing a 
Markov algorithm based language called PANON, introduced the 
notion of a "string variable" as an extension to Markov 
algorithms. 
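
The rule counts quoted above can be checked by generating the rule list mechanically. The following Python sketch is illustrative only; the function name `reversal_rules`, the triple representation of a rule, and the `marker` parameter are assumptions of the sketch rather than part of the formalism.

```python
import string


def reversal_rules(letters, marker="*"):
    """Build the plain Markov rules for reversing a parenthesized string."""
    # One rule per ordered pair of letters: the marker carries d leftward.
    rules = [(c + d + marker, d + marker + c, False)
             for c in letters for d in letters]
    # One rule per letter: move the letter out in front of the parenthesis.
    rules += [("(" + c + marker, c + "(", False) for c in letters]
    rules.append(("()", "", True))              # nothing left to reverse
    rules.append((")", marker + ")", False))    # start a new pass
    return rules


small = reversal_rules("IONX")
full = reversal_rules(string.ascii_uppercase)
print(len(small), len(full))   # 22 704
```

For an alphabet of k letters the count is k² + k + 2, which gives 22 rules for the four-letter alphabet and 704 for the 26 English letters.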

Extended Markov Algorithms: 

Let A and V be disjoint alphabets of characters, called 
respectively the object alphabet and variable alphabet, and 
let "→", "·" and "Λ" be characters not in A or V. Let each 
variable in V represent some pre-specified (possibly infinite) 
set of object alphabet strings. The case where different 
variables can represent different sets of object alphabet 
strings is not excluded. An extended Markov algorithm is a 
finite sequence of substitution rules of the form 

    s_1 →(·) t_1 
    s_2 →(·) t_2 
    ... 
    s_n →(·) t_n 

where the s_i and t_i, 1≤i≤n, are either "Λ" or strings of 
object alphabet and variable alphabet characters such that each 
variable in t_i occurs also in s_i. 


A string s_i represents the set of object alphabet 
strings computed by concatenating in order from left to right 
each of the object alphabet characters in s_i with any object 
alphabet string represented by a variable in s_i. The set 
represented by s_i is constrained in that each occurrence of 
the same variable in s_i must be set to the same object 
alphabet string in computing the set of concatenated object 
strings that s_i represents. For example, if ℓ is a string 
variable representing any member of the set {V W} and m is a 
string variable representing any member of the set {Y ZZ}, the 
string "ℓAmAℓ" represents any member of the set {VAYAV VAZZAV 
WAYAW WAZZAW}. 

A string s_i is said to occur within an object string X 
if one or more of the strings represented by s_i occurs within 
X. The "leftmost" occurrence of s_i in X is the string such 
that first, (of the occurrences of s_i in X) the occurrence 
begins with the leftmost object alphabet character, and second, 
the occurrence is as short as possible. 

An extended Markov algorithm of the above form when 
applied to an object string X is taken to mean: 

(a) Look down among the substitution rules for the first 
    rule in which s_i occurs in X. 

(b) If such a rule is found, replace the leftmost 
    occurrence of s_i in X by the string obtained from t_i 
    by replacing each variable in t_i by the string 
    used in place of the variable in s_i. If a "·" 
    occurs after the "→" in the substitution rule, 
    terminate the algorithm. Otherwise repeat the 
    application to the newly formed string. 

(c) If no such rule is found, terminate the algorithm.* 

It will be convenient to introduce a special symbol after the 
s_i to mean that the string matched to s_i must extend to the 
last character of the object string. I will use the symbol 
"·" for this purpose.** 

*The transformation specified by a substitution rule of an 
extended Markov algorithm is computable only if the string 
variables represent recursive sets. This requirement is 
discussed in detail by Caracciolo (Chap. 5, ref. 11). In 
this dissertation all sets defined for string variables are 
recursive. 

**This convention can be viewed solely within the framework of 
extended Markov algorithms by (a) replacing each "·" after 
the s_i by a special character not in the object alphabet, (b) 
replacing each corresponding t_i with t_i followed by the 
special character, (c) appending to each object string X the 
special character, and (d) applying to the transformed object 
string an algorithm that simply removes the special character. 

For example, let s and s' be string variables representing 
any string of English letters. The extended Markov 
algorithm 

    (1) sI → sO 

transforms the string "BINGO" into the string "BONGO", the 
extended Markov algorithm 

    (2) XsXs'X → ss' 

transforms the string "XABXCDX" into the string "ABCD", the 
extended Markov algorithm 

    (3) sXs → X 

transforms the string "QABXAB" into the string "QX", and 
the extended Markov algorithm 

    (4) Xs· → Λ 
        sX →· X 

transforms the string "?VWXX?XBC" into the string "?XX?".* 

*Note that the character "?" is not an English letter. 

More precisely, an extended Markov algorithm will be 
specified in three parts: 

(a) A statement listing some string variables and the 
    names of the sets whose members the variables 
    represent. 

(b) A formal definition of the sets named in (a). 

(c) A list of extended Markov algorithm substitution 
    rules including possible occurrences of the 
    defined string variables. 

I will use statements of the form 
"| a_1, a_2, ..., a_j ∈ A | b_1, b_2, ..., b_m ∈ B | ... | 
p_1, p_2, ..., p_n ∈ P |", where the a_i, b_i, ... , and p_i 
are variables and the A, B, ... , and P are the names of the 
sets, to denote that a_1 represents members of the set named 
A, a_2 represents members of the set named A, etc. I will use 
canonical systems to define the named sets. Using this 
notation the above extended Markov algorithms are more 
precisely stated 

    | s,s' ∈ LETTER STR | 

    LETTER STR<A>,<B>, ... ,<Z>; 
    LETTER STR<a>,<b> → LETTER STR<ab>; 

    (1) sI → sO 

    (2) XsXs'X → ss' 

    (3) sXs → X 

    (4) Xs· → Λ 
        sX →· X 
Consider again the algorithm for reversing any 
parenthesized string of letters from the alphabet {I O N X}. 
Using the following variable and set definitions 

    | c,d ∈ LETTER | 
    LETTER<I>,<O>,<N>,<X>; 

the extended Markov algorithm for this string transformation 
can now be simply given 

    cd* → d*c 
    (c* → c( 
    ()  →· Λ 
    )   → *) 

Note that by simply augmenting the set named "LETTER" (and 
the object alphabet) to include all the letters of the English 
alphabet, the same four extended Markov algorithm substitution 
rules define the algorithm for reversing a string containing 
all English letters, whereas 704 substitution rules are 
required to define this transformation with a Markov algorithm. 
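
The behavior of string variables can be approximated with regular expressions, which makes it possible to check examples (1) to (4) and the four-rule reversal algorithm mechanically. The Python sketch below is illustrative only and rests on several assumptions: a named group stands for a string variable, a backreference enforces that repeated occurrences of a variable match the same string, "$" stands for the end-of-string symbol, and the marker is written "*". Non-greedy matching tries short matches first, which agrees with the "as short as possible" rule in these examples but is not guaranteed to find the overall shortest occurrence when several variables interact. The names `run_extended` and `reversal` are hypothetical.

```python
import re


def run_extended(rules, text, max_steps=10_000):
    """Rules are (regex, template, terminal); named groups act as variables."""
    for _ in range(max_steps):
        for pattern, template, terminal in rules:
            m = re.search(pattern, text)          # leftmost match
            if m:
                text = text[:m.start()] + m.expand(template) + text[m.end():]
                break
        else:
            return text                           # no rule applies
        if terminal:
            return text                           # terminal rule applied
    raise RuntimeError("step limit exceeded")


S1 = r"(?P<s>[A-Z]+?)"    # s  in LETTER STR
S2 = r"(?P<t>[A-Z]+?)"    # s' in LETTER STR

ex1 = run_extended([(S1 + "I", r"\g<s>O", False)], "BINGO")
ex2 = run_extended([("X" + S1 + "X" + S2 + "X", r"\g<s>\g<t>", False)],
                   "XABXCDX")
ex3 = run_extended([(S1 + "X(?P=s)", "X", False)], "QABXAB")
ex4 = run_extended([("X" + S1 + "$", "", False), (S1 + "X", "X", True)],
                   "?VWXX?XBC")


def reversal(letter_class):
    """The four reversal rules, for any regex character class of letters."""
    c, d = f"(?P<c>{letter_class})", f"(?P<d>{letter_class})"
    return [
        (c + d + r"\*", r"\g<d>*\g<c>", False),   # cd* -> d*c
        (r"\(" + c + r"\*", r"\g<c>(", False),    # (c* -> c(
        (r"\(\)", "", True),                      # () ->. null string
        (r"\)", "*)", False),                     # )  -> *)
    ]


print(ex1, ex2, ex3, ex4)
print(run_extended(reversal("[IONX]"), "(NOXIN)"))
print(run_extended(reversal("[A-Z]"), "(HELLO)"))
```

Only the character class passed to `reversal` changes when the alphabet grows, which mirrors the observation that the same four rules serve for any set named "LETTER".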
Even with the extension to Markov algorithms given 
above, algorithms expressed in the extended Markov formalism 
often become exceedingly lengthy. One frequently occurring 
source of this lengthening is a requirement to construct the 
functional composition of two or more algorithms. Although 
Markov's monograph defines the additional substitution rules 
for taking two Markov algorithms and constructing the Markov 
algorithm defining their functional composition, the number 
of resulting substitution rules can be enormous. For example, 
for 2 Markov algorithms over an object alphabet consisting of 
all English letters, 1,457 substitution rules (Section 3.3, 
ref. 9) must be added to the algorithms to produce the 
algorithm representing their functional composition. Although 
by using the extension to Markov algorithms the number of 
additional rules could be reduced to 7, an algorithm composed 
by several functional compositions would quickly require many 
substitution rules and would be correspondingly difficult to 
understand. 

On the other hand, Church's λ-calculus [17, 18], a formalism 
that makes precise the notion of a function and its properties, 
is ideally suited to handle the concept of functional 
composition. The next section presents the formalism of the 
λ-calculus, and the subsequent section discusses the embedding 
of the formalism of extended Markov algorithms within the 
formalism of the λ-calculus. This combined formalism 
will provide the heart of this dissertation's target 
language for defining semantics. 

3.1b The λ-Calculus* 

*The terminology in this chapter is due mostly to Church and 
Landin. 

The λ-calculus is a formalism for writing certain classes 
of expressions. One interpretation (the interpretation taken 
here) of the formalism is as an explication of ideas about 
the specification and application of functions. Let C and 
V be disjoint sets of symbols, not including the symbols 
{λ . ( ) ␣}, where "␣" denotes a string of one or more blank 
spaces. The set C will be called the set of constants. The 
set V will be called the set of variables. A well-formed 
expression in the λ-calculus is any string defined 
(recursively) by the following rules: 

(a) If p is a variable, or p is a constant, then p is 
    a well-formed expression. 

(b) If E and F are well-formed expressions, then (E F) 
    is a well-formed expression. 

(c) If v is a variable and E is a well-formed 
    expression, then λv.E is a well-formed expression. 

For example, if C comprises the symbols {3 SQ} and V comprises 
the symbol {X}, some example expressions are "3", "(SQ 3)" 
and "λX.(SQ X)". An expression of the form (E F) is called 
a combination, and the expressions E and F in (E F) are called 
respectively the operator and operand of the combination. An 
expression of the form λv.E is called a λ-expression, and the 
expression E in λv.E is called the body of the λ-expression. 
Here, a λ-expression of the form λv.E will be interpreted as 
a representation of the function mapping the variable v into 
the expression E. 

An occurrence of a variable in a well-formed expression 
is distinguished as "free" or "bound" according to the fol- 
lowing rules: 

(a) If E is an expression consisting only of a variable, 
the occurrence of the variable in E is free. 

(b) If E and F are expressions, an occurrence of a 
variable in (E F) is free or bound according as it 
is free or bound in E or F. 

(c) If v is a variable and E is an expression, all 
    occurrences of v in λv.E are bound while an occurrence 
    of a variable different from v in λv.E is free or 
    bound according as it is free or bound in E. 

For example, in the expression "λX.(F X)", where "F" and "X" 
are variables, the occurrence of "F" is free and the 
occurrences of "X" are bound. 
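
Rules (a) to (c) translate directly into a recursive function. The Python sketch below is illustrative only: the tuple encoding of expressions and the name `free_vars` are assumptions of the sketch, and it computes the set of variables that have at least one free occurrence rather than classifying each occurrence individually.

```python
def free_vars(e):
    """Return the variables with a free occurrence in expression `e`.

    Expressions are tuples:
      ("var", x), ("const", k), ("app", E, F), ("lam", v, E).
    """
    tag = e[0]
    if tag == "var":
        return {e[1]}                              # rule (a)
    if tag == "const":
        return set()                               # constants are not variables
    if tag == "app":
        return free_vars(e[1]) | free_vars(e[2])   # rule (b)
    if tag == "lam":
        return free_vars(e[2]) - {e[1]}            # rule (c): v becomes bound
    raise ValueError(f"not a well-formed expression: {e!r}")


example = ("lam", "X", ("app", ("var", "F"), ("var", "X")))   # λX.(F X)
print(free_vars(example))
```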

Church introduces rules for transforming expressions. 
Using these rules, some expressions can be transformed into 
a "principal normal form." The principal normal form of an 
expression may be viewed as a "canonical" or standard repre- 
sentation of the value of the expression. Because of the 
introduction of assignment and goto expressions into the 
target language to be presented later, the rules for trans- 
forming a target language expression into normal form will 
not always hold. Instead, the value of a target language 
expression will be defined in this dissertation by an 
extended Markov algorithm specification of a machine that 
mechanically converts an expression into a canonical 
representation of the value of the expression. 

This machine will be defined formally in section 2 of 
this chapter. The operation of this machine for evaluating 
λ-calculus expressions will be presented informally in this 
section. 

In general, the value of a constant or free variable is 
the object denoted by the constant or variable. A list of 
the values of the constants and free variables is called an 
"environment." The value of a λ-expression is called a 
"λ-closure" and consists of two parts: (a) the expression 
itself, and (b) the environment in which the λ-expression 
occurs, i.e., the list of the values of the constants and 
free variables in the expression. 

The value of a combination is the object computed by 
evaluating its operand, evaluating its operator (using the 
values of constants and free variables given by the 
environment of the combination), and then applying the value 
of the operator to the value of the operand. If the operator 
of a combination is a λ-expression, the result of applying the 
λ-expression to its operand is computed by (a) coupling the 
bound variable of the λ-expression with the value of the 
operand to which the λ-expression is being applied, (b) adding 
this couple to the environment of the λ-expression, and 
(c) evaluating the body of the λ-expression using this new 
environment. 

Some example λ-calculus expressions are the following: 

    3         λX.3          (λX.3 2) 
    (SQ 3)    λX.(SQ X)     (λX.(SQ X) 3) 
    X         λX.X          (λX.X 3) 

If "2", "3" and "SQ" are constants denoting respectively the 
integer two, the integer three, and the function mapping an 
integer into its square, the nine expressions above denote 

    the integer three    the function mapping X     the integer three 
                         into the integer three 

    the integer nine     the function mapping X     the integer nine 
                         (presumably an integer) 
                         into its square 

    some object X        the identity function      the integer three 
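
The informal evaluation rules above (environments, λ-closures, and operand-before-operator evaluation of combinations) can be sketched as a small interpreter. The Python below is illustrative only and is not the machine of Section 3.2: the tuple encoding of expressions, the dictionary environment, and the name `evaluate` are assumptions of the sketch.

```python
def evaluate(e, env):
    """Evaluate a tuple-encoded expression in environment `env` (a dict)."""
    tag = e[0]
    if tag in ("const", "var"):
        return env[e[1]]                     # value given by the environment
    if tag == "lam":
        return ("closure", e, env)           # the expression plus its environment
    if tag == "app":
        arg = evaluate(e[2], env)            # operand first
        fun = evaluate(e[1], env)            # then operator
        if isinstance(fun, tuple) and fun[0] == "closure":
            _, (_, v, body), saved = fun
            # couple the bound variable with the operand value,
            # add the couple to the closure's environment, evaluate the body
            return evaluate(body, {**saved, v: arg})
        return fun(arg)                      # primitive function
    raise ValueError(f"not a well-formed expression: {e!r}")


ENV = {"2": 2, "3": 3, "SQ": lambda n: n * n}

# (λX.3 2), (λX.(SQ X) 3), (λX.X 3)
r1 = evaluate(("app", ("lam", "X", ("const", "3")), ("const", "2")), ENV)
r2 = evaluate(("app", ("lam", "X", ("app", ("const", "SQ"), ("var", "X"))),
               ("const", "3")), ENV)
r3 = evaluate(("app", ("lam", "X", ("var", "X")), ("const", "3")), ENV)
print(r1, r2, r3)   # 3 9 3
```

Primitive functions are represented here by ordinary Python callables; in the combined formalism they are instead given by extended Markov algorithms.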



3.1c The Marriage of Extended Markov Algorithms to the 
λ-Calculus 

This section embeds the formalism of extended Markov 
algorithms within the formalism of the λ-calculus. The wedding 
of these two formalisms will form the basis for the target 
language that will be presented in Section 3.1d. 

Let E be a set of strings representing extended Markov 
algorithms, where the characters {[, ], |, and "} do not occur 
in E. Let L be another set of strings, called the set of 
literals, where the character ' does not occur in L. Let C 
be a set of basic symbols, called the set of constants, where 
each constant is either a string from E enclosed by the 
brackets [ and ] or a string from L enclosed by the quotation 
marks ' and '. Let V be another set of basic symbols, called 
the set of variables, where each variable contains no 
occurrence of {[, ], or '}. (Thus the sets C and V are 
disjoint.) An expression in the combined formalism will 
consist of any expression M such that each occurrence of a 
variable in M is bound in M. 

The extended Markov algorithms will be interpreted as 
definitions of primitive functions, the literals will be 
interpreted as representations of the objects upon which the 
primitive functions operate, and the variables will be inter- 
preted as names of primitive functions, literals, or functions 
of the primitive functions and literals. In the examples in 
the text, the quotation marks will often be omitted from 
constants that represent integers. 

Expressions in the λ-calculus are strings of basic 
symbols, and hence to include an extended Markov algorithm 
in the λ-calculus, it is necessary to have a linear 
representation of an extended Markov algorithm. An extended 
Markov algorithm of the form 

    X 
    D 
    s_1 →(·) t_1 
    s_2 →(·) t_2 
    ... 
    s_n →(·) t_n 

where X is the statement listing the string variables in the 
algorithm, and D is the definition of the sets named in X, 
will therefore be represented 

    [X D s_1 →(·) t_1 | s_2 →(·) t_2 | ... | s_n →(·) t_n] 

For convenience, however, the statement X and the definition 
D will generally be given separately from the list of 
substitution rules in the algorithm. For example, consider the 
following expression: 

    λα.([B→D|C→F|O→I] α) 

This expression can be used in combination with other 
expressions to transform strings. For example the expression 

    (λα.([B→D|C→F|O→I] α) 'COBBLER') 

successively takes on the values 

    ([B→D|C→F|O→I] 'COBBLER') 

and finally 

    FIDDLER 

In defining the semantics of computer languages, it 
will be convenient to consider the symbols {→ · Λ [ ] |} as 
object alphabet symbols in an extended Markov algorithm. I 
therefore adopt the convention* that any string (not 
including the symbol ") enclosed by the quotation marks " and 
" in an extended Markov algorithm is to be considered as an 
object alphabet string. This use of quotation marks allows 
us to consider extended Markov algorithms whose object 
strings are themselves extended Markov algorithms. This 
point will be discussed in the definition of the primitive 
function "CAT", to be presented shortly. 

The basic notation for the combined formalism is not 
especially suited to digestion by humans. To make the nota- 
tion more palatable, I will introduce a series of alternate 
notations for writing expressions in the combined formalism. 
The alternate notations will be given for convenience and 
conciseness in communicating the expressions to humans. The 
alternate notations for the λ-calculus, and the λ-calculus 
definitions for conditional expressions and recursive 
functions are for the most part due to Landin. 

Alternate Notations for Extended Markov Algorithms: 

The linear representation of an extended Markov algorithm 
is difficult to visualize. Accordingly, I will generally use 
the notation 

    [ s_1 →(·) t_1 
      s_2 →(·) t_2 
      ... 
      s_n →(·) t_n ] 

(where the variable and set definitions for the algorithm 
will be given separately) in place of the strict linear 
representation of an extended Markov algorithm in the 
λ-calculus. For example, the expression 

    λα.([B→D|C→F|O→I] α) 

will be written 

    λα.( [ B → D 
           C → F 
           O → I ] α) 


The Function CAT: 

Let s be a string variable representing any string of 
characters and consider the following expression 

    λα.([s· →· "[Λ →·" s "]"] α) 

This expression defines a function mapping the value of the 
variable α into the extended Markov algorithm [Λ →· α], 
where "α" here denotes the value of the variable α. This 
extended Markov algorithm when applied to an object string 
concatenates the string value of α to the front of the object 
string. The function above will be called "CAT". For example, 
the expression ((CAT 'HELLO ') 'THERE') successively takes on 
the values: 

    ((λα.([s· →· "[Λ →·" s "]"] α) 'HELLO ') 'THERE') 
    (([s· →· "[Λ →·" s "]"] 'HELLO ') 'THERE') 
    ([Λ →· HELLO ] 'THERE') 
    HELLO THERE 

Similarly, the expression ((CAT ((CAT 'HOW ') 'ARE ')) 'YOU') 
takes on the value "HOW ARE YOU". Note that the extended 
Markov algorithm [s· →· "[Λ →·" s "]"] maps its object string 
into another extended Markov algorithm, and thus extended 
Markov algorithms have the ability to define functionals, 
i.e., functions mapping an argument into a new function. 

In defining the semantics of a computer language, it 
will frequently be necessary to concatenate strings to 
produce a string that represents an extended Markov algorithm 
or a string to which an extended Markov algorithm is applied. 
It will be convenient not to state explicitly the 
concatenation of strings in these cases, and I therefore 
introduce the following alternate notation. 

    Let "CAT" be the function as defined above, 
    let x_i, 1≤i≤n, be expressions, and 
    let ((CAT ( ... ((CAT ((CAT x_1) x_2)) x_3) ... )) x_n) be 
    an expression whose value is an extended Markov 
    algorithm or a string to which an extended 
    Markov algorithm is applied. The x_i can be 
    written directly in the form of the extended 
    Markov algorithm or the concatenated string to 
    which an extended Markov algorithm is applied. 

Thus, for example, the expressions 

    λπ.λα.λβ.(((CAT ((CAT ((CAT ((CAT '[TRUE →· ') α)) 
                 ' | FALSE →· ')) β)) ']') π) 

    λα.λβ.([TRUE/TRUE →· TRUE | FALSE/TRUE →· FALSE | 
            TRUE/FALSE →· FALSE | FALSE/FALSE →· FALSE] 
           ((CAT ((CAT α) '/')) β)) 

can be written 

    λπ.λα.λβ.([TRUE →· α | FALSE →· β] π) 

    λα.λβ.([TRUE/TRUE →· TRUE | FALSE/TRUE →· FALSE | 
            TRUE/FALSE →· FALSE | FALSE/FALSE →· FALSE] α/β) 

or further rewritten using the previously given alternate 
notation 

    λπ.λα.λβ.( [ TRUE  →· α 
                 FALSE →· β ] π) 

    λα.λβ.( [ TRUE/TRUE   →· TRUE 
              TRUE/FALSE  →· FALSE 
              FALSE/TRUE  →· FALSE 
              FALSE/FALSE →· FALSE ] α/β) 

The first expression defines a function* that when successively 
applied to three arguments produces the value of the variable 
α if the value of the variable π is "TRUE" and produces the 
value of the variable β if the value of the variable π is 
"FALSE". The second expression defines a boolean-valued 
function that when successively applied to two boolean-valued 
arguments produces the value "TRUE" if both arguments have 
the value "TRUE" and produces the value "FALSE" if either 
argument has the value "FALSE". The first expression will 
later be used to define conditional expressions. The second 
expression will later be used to define the function for 
producing the logical "and" of two arguments. 

*Greek letters will generally not occur as object strings for 
extended Markov algorithms. I will therefore use Greek 
letters in an extended Markov algorithm or the string to 
which it is applied to denote the symbols that are bound 
variables. Thus, in writing the strict representation of 
the algorithm or its object string in terms of λ-calculus 
expressions, strings not containing Greek letters are to 
be quoted and the Greek letters are not to be quoted. 

Note that the first expression above constructs an 
extended Markov algorithm from literal strings and bound 
variables. The notion of a bound variable lends itself 
immediately to extended Markov algorithms embedded within the 
λ-calculus and allows the construction of extended Markov 
algorithms that depend on the values of the variables to 
which the algorithms are applied. This compatibility 
between the married formalisms greatly simplified the 
definitions of the primitive functions for SNOBOL/1 and 
ALGOL/60. 
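
The idea of building a Markov algorithm from the values of bound variables can be imitated with closures. The Python sketch below is illustrative only. It uses plain Markov rules (no string variables), represents an algorithm as a Python function rather than as a string of the formalism, and takes arguments together rather than one at a time; the names `run_markov`, `algorithm`, `cat`, `cond` and `and_` are assumptions of the sketch.

```python
def run_markov(rules, text, max_steps=10_000):
    """Apply an ordered list of (s, t, terminal) rules to `text`."""
    for _ in range(max_steps):
        for s, t, terminal in rules:
            pos = text.find(s)               # "" (the null string) is found at 0
            if pos >= 0:
                text = text[:pos] + t + text[pos + len(s):]
                break
        else:
            return text
        if terminal:
            return text
    raise RuntimeError("step limit exceeded")


def algorithm(*rules):
    """Turn a rule list into a function on strings."""
    return lambda text: run_markov(list(rules), text)


def cat(a):
    """CAT: map the string a to the algorithm [Λ →· a]."""
    return algorithm(("", a, True))


def cond(pi, a, b):
    """Build [TRUE →· a | FALSE →· b] from the arguments and apply it to pi."""
    return algorithm(("TRUE", a, True), ("FALSE", b, True))(pi)


def and_(a, b):
    """Apply the four-rule table to the concatenated string a/b."""
    table = algorithm(
        ("TRUE/TRUE", "TRUE", True),
        ("FALSE/TRUE", "FALSE", True),
        ("TRUE/FALSE", "FALSE", True),
        ("FALSE/FALSE", "FALSE", True),
    )
    return table(cat(cat(a)("/"))(b))


print(cat("HELLO ")("THERE"))
print(cat(cat("HOW ")("ARE "))("YOU"))
print(cond("TRUE", "0", "1"), and_("TRUE", "FALSE"))
```

Every rule in `cond` and `and_` is terminal, so the replacement text is never rescanned; without that, a value such as "TRUE" substituted for α could be rewritten a second time.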

Alternate notations for the λ-calculus: 

The basic notation for defining and applying functions 
in the λ-calculus is somewhat awkward for those accustomed 
to writing functions in the conventional mathematical 
notation. I thus introduce the following alternate notations. 

Let F, V_1, V_2, ... , V_n be variables and M, Q, E_1, E_2, 
... , E_n be expressions. Expressions of the form 

    (a) (λV_1.(λV_2. ... (λV_n.M E_n) ... E_2) E_1) 

    (b) (λF.M λV_1.λV_2. ... λV_n.Q) 

    (c) (...((F E_1) E_2) ... E_n) 

can be written 

    (a) LET V_1, V_2, ... , V_n = E_1, E_2, ... , E_n 
        IN M 

    (b) LET F(V_1, V_2, ... , V_n) = Q 
        IN M 

    (c) F(E_1, E_2, ... , E_n) 

where if M, Q, E_1, E_2, ... , or E_n are enclosed in 
parentheses the parentheses can be dropped. Thus, for example, 
the expressions 

    (λX.('SQ' X) 3) 

    ((λX.λY.(('CAT' X) Y) 'HELLO ') 'THERE') 

    (λCOND.(((COND 'TRUE') 0) 1) λπ.λα.λβ.( [ TRUE  →· α 
                                              FALSE →· β ] π)) 

can be written 

    LET X = 3 
    IN 'SQ' X 

    LET X,Y = 'HELLO ', 'THERE' 
    IN (('CAT' X) Y) 

    LET COND(π,α,β) = ( [ TRUE  →· α 
                          FALSE →· β ] π) 
    IN COND('TRUE',0,1) 


Conditional Expressions: 

Consider the function COND defined previously 

    COND(π,α,β) = ( [ TRUE  →· α 
                      FALSE →· β ] π) 

This function selects the value of α if the value of π is 
"TRUE" and the value of β if the value of π is "FALSE". For 
example, the value of COND('TRUE',0,1) is the string "0". 
Next consider the following expression from ALGOL/60 

    IF A=0 THEN B*A ELSE B/A 

and the (loosely written) expression in the combined 
formalism 

    COND(A=0,B*A,B/A) 

where COND is defined as above. This expression does not 
correctly mirror the ALGOL/60 expression. In ALGOL/60 the 
expression B*A is evaluated only if the value of A is equal 
to zero, and the expression B/A is evaluated only if the 
value of A is not equal to zero. This order of evaluation 
ensures that B/A is not evaluated if the value of A is zero. 
Now consider the following (loosely written) target language 
expression 

    (COND(A=0,λπ.B*A,λπ.B/A) 'A') 

where π is a dummy variable. In evaluating this expression, 
the function COND will be applied to its arguments, one of 
the λ-expressions λπ.B*A or λπ.B/A will be selected, and then 
the selected λ-expression will be applied to the operand 'A'. 
Thus only the body of the selected λ-expression will be 
evaluated.* The use of the dummy variable serves as a delaying 
mechanism in evaluating expressions. 

*Note, in forming a λ-closure, the body of the λ-expression is 
not evaluated. 

Conditional expressions of the above form will be used 
repeatedly in defining the semantics of computer languages. 
I therefore introduce the following alternate notation. 

Let s_1, s_2, t_1, t_2, and t_3 be expressions. Expressions 
of the form 

    (COND(s_1,λπ.t_1,λπ.t_2) 'A') 

and 

    (COND(s_1,λπ.t_1,λπ.(COND(s_2,λπ.t_2,λπ.t_3) 'A')) 'A') 

can be written 

    s_1  => t_1 
    ELSE => t_2 

and 

    s_1  => t_1 
    s_2  => t_2 
    ELSE => t_3 

Similarly, this alternate notation can be extended to include 
an arbitrary number of nested conditional expressions. 
For example, the expression 

    (COND(A=0,λπ.B*A,λπ.B/A) 'A') 

can be written 

    A=0  => B*A 
    ELSE => B/A 
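
The delaying effect of the dummy variable can be observed in any language that evaluates arguments before a call. The Python sketch below is illustrative only: `cond` and `truth` are hypothetical helpers, the values of A and B are arbitrary, and Python's conditional expression stands in for the two-rule Markov algorithm.

```python
def cond(pi, a, b):
    """Select a when pi is "TRUE" and b otherwise; both arrive evaluated."""
    return a if pi == "TRUE" else b


def truth(flag):
    """Represent a Python boolean by the strings used in the text."""
    return "TRUE" if flag else "FALSE"


A, B = 0, 6

# Direct form: B / A is evaluated before cond is even entered.
try:
    cond(truth(A == 0), B * A, B / A)
    eager = "ok"
except ZeroDivisionError:
    eager = "division by zero"

# Delayed form: each branch is a function of a dummy argument, and only
# the selected function is applied (here to the dummy operand "A").
delayed = cond(truth(A == 0), lambda _: B * A, lambda _: B / A)("A")

print(eager, delayed)
```

Forming each `lambda` corresponds to forming a λ-closure: the body is stored but not evaluated until the closure is applied.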



3.1d The Target Language 

The combined formalism of extended Markov algorithms and 
the λ-calculus presented in the previous section appears 
sufficient to define fairly concisely many constructions in 
computer languages. However, two common features of many 
computer languages, that for assigning new values to variables 
and that for transferring control to another statement in a 
program, have evaded characterization in the combined formalism. 
To handle this circumstance, the combined formalism will be 
augmented with new expressions to mirror directly the 
assignment of new values to variables and the transfer of 
evaluation from one expression to another. The augmented 
version of the combined formalism will comprise the target 
language of this dissertation. 
Sequences of Expressions: 

Before discussing the rules for forming well-formed 
expressions in the target language, let us consider a 
mechanism for defining a sequence of expressions, where each 
expression E_1, E_2, ... , E_n in the sequence is to be 
evaluated in the numerical order indicated by its numerical 
subscript. Using the rule for evaluating the operand of a 
combination before the operator of a combination, the target 
language provides a device for handling a sequence of 
expressions. 

Let X, E, E_1, E_2, ... , and E_n be expressions, and consider 
the following λ-expression, called T 

    λα.λβ.(β α) 

When evaluated, the combination (T E) results in first 
evaluating the expression E and then returning the value of the 
λ-closure for λβ.(β α), where α is coupled with the value of 
E. Next consider the combination 

    [(T E) λπ.X] 

where square brackets have been used here (for convenience) 
in place of parentheses.* This combination is evaluated as 
follows: 

1. The λ-closure for λπ.X is computed. 

2. The combination (T E) is computed, resulting in 
   first evaluating E and then returning the λ-closure 
   for λβ.(β α), where α is coupled with the value of 
   E. 

3. The value of the expression in 2 is applied to the 
   value of the expression in 1, resulting in applying 
   λπ.X to E, which returns the value of X. 

In particular, if X is the expression "π", this combination 
results in returning the value of E. 

*Square brackets will be used frequently in this section. 
Strictly speaking, all square brackets should be replaced 
by parentheses. 

Next consider the expression 

    [(T E_1) λπ.[(T E_2) λπ.π]] 

This combination is evaluated as follows: 

1. The λ-closure for λπ.[(T E_2) λπ.π] is computed. 
   Note that the value of E_2 is not computed in forming 
   the λ-closure. 

2. The combination (T E_1) is computed, resulting in 
   first evaluating E_1 and then returning the λ-closure 
   for λβ.(β α). 

3. The value of the expression in 2 is applied to the 
   value of the expression in 1, resulting in returning 
   the value of [(T E_2) λπ.π]. This evaluation 
   results in first computing the value of E_2 and then 
   returning the value of E_2. 

Thus the evaluation of this expression results in first 
evaluating E_1, then evaluating E_2, and finally returning the 
value of E_2. 

Similarly, consider the expression

    [(T E_1) λπ.[(T E_2) λπ.[(T E_3) λπ.π]]]
    ↑           ↑           ↑
    1           2           3

When evaluated, this expression results in successively
evaluating E_1, E_2, and E_3 and then returning the value of E_3.
This expression, however, has the following important property,
which will be used in the definition of the transfer of control
to some labeled expression in a sequence of expressions.
Let C_1, C_2, and C_3 be the combinations that are given by the
matching pairs of square brackets indicated by the numbers
1, 2, and 3 above. The evaluation of C_1 results in successively
evaluating E_1, E_2, and E_3 and returning the value of
E_3; the evaluation of C_2 results in successively evaluating
E_2 and E_3 and returning the value of E_3; the evaluation of C_3
results in evaluating E_3 and returning the value of E_3.
More generally, an expression of the form

    [(T E_1) λπ.[(T E_2) ... λπ.[(T E_n) λπ.π]...]]
    ↑           ↑               ↑
    1           2               n

when evaluated, results in successively evaluating E_1, E_2,
..., and E_n and returning the value of E_n. Moreover, the
evaluation of any combination C_i beginning with the square
bracket denoted by the integer i results in successively
evaluating the expressions E_i, E_(i+1), ..., and E_n and returning
the value of E_n. This latter effect leads us to the notion
of a "labeled" expression.
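
The sequencing device above can be sketched in Python, which is used
here only for illustration. The sketch assumes that each expression
E_i is modeled as a zero-argument function so that the moment of its
evaluation can be observed; the names combination and traced are
hypothetical. Python evaluates T(E_i()) before entering the body of
the following λπ, which gives the same order of effects as the
operand-before-operator rule.

```python
def T(alpha):
    """T = λα.λβ.(β α): couple α with a value, then hand it to β."""
    return lambda beta: beta(alpha)


def combination(exprs, i=0):
    """Evaluate C_i = [(T E_i) λπ.[(T E_(i+1)) ... λπ.[(T E_n) λπ.π]...]].

    exprs is a list of zero-argument functions standing in for
    E_1..E_n (an assumption of this sketch); i is a zero-based position.
    """
    value = exprs[i]()                      # E_i is evaluated first
    if i == len(exprs) - 1:
        return T(value)(lambda pi: pi)      # innermost λπ.π returns E_n
    # λπ ignores π and continues with the next combination C_(i+1)
    return T(value)(lambda pi: combination(exprs, i + 1))


def traced(name, value, log):
    """Build an expression that records when it is evaluated."""
    def expr():
        log.append(name)
        return value
    return expr


log = []
exprs = [traced("E1", 10, log), traced("E2", 20, log), traced("E3", 30, log)]
print(combination(exprs), log)      # the whole sequence, starting at C_1
log.clear()
print(combination(exprs, 1), log)   # starting at C_2 skips E_1
```

Starting the evaluation at a later combination is exactly the
property that the notion of a labeled expression relies on.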



Labels and Label References: 

Let V be the set of variables (as described earlier) and
let L be the set obtained from V by affixing a ":" to each
variable in V. The set L will be called the set of labels.
Consider an expression of the form

    l_1[(T E_1) λπ.l_2[(T E_2) ... λπ.l_n[(T E_n) λπ.π]...]]

where the l_i, 1≤i≤n, indicate the possible occurrences of
labels, each of which must be different. An expression of
this form will be called a "sequence" of the expressions E_1,
E_2, ..., and E_n. If we ignore the labels in an evaluation,
the evaluation of any combination C_i following some label
l_i, 1≤i≤n, results in successively evaluating E_i, E_(i+1), ...,
and E_n and returning the value of E_n.

A sequence of the above form may occur within the body
of some λ-expression, which in turn may occur within a sequence
in the body of some encompassing λ-expression, and so
on for further encompassing λ-expressions. In the target
language the transfer of control to some labeled expression
will be designated by expressions of the form (GOTO. E),
where E is an expression referring to some label. A label
reference will be a string of the form .l, where l: is a
label. The value of a label reference .l will consist of
two parts: (a) the combination in the innermost encompassing
λ-expression such that the combination is prefixed by the
label l:, and (b) the environment within which the combination
is to be evaluated. The value obtained by evaluating a label
reference will be called a "label-closure".

I now proceed to a presentation of the target language of 
the dissertation. 
Target Language Expressions:

An expression in the target language is defined as
follows. Let C, V, and L be sets of symbols, called the sets
of constants, variables, and labels, as described earlier.

(a) If p is a variable or p is a constant, then p is 
an expression. 

(b) If E and F are expressions, then (E F) is an 
expression. 

(c) If v is a variable and E is an expression, then
λv.E is an expression.

(d) If v is a variable and E is an expression, then 
(v ASSIGN. E) is an expression. 

(e) If S is a sequence, then S is an expression. 

(f) If E is an expression, then (GOTO. E) is an expres- 
sion. 

Expressions of type (a), (b), and (c) are expressions in the
combined formalism as introduced previously. Expressions of
type (d), (e), and (f) are new. The evaluation of an expression
of the form (v ASSIGN. E) will result in first changing
the value of the variable v to the value of the expression E
and then returning the null string as the value of the
expression (v ASSIGN. E). If the labels in an expression of
type (e) are ignored, the evaluation of a sequence results in
successively evaluating each of the component expressions E_1,
E_2, ..., and E_n in the sequence and returning the value of E_n.
If E is an expression of the form .l, where l: is a label, the
evaluation of E will result in forming the label-closure for
.l, and the evaluation of an expression of the form (GOTO. E)
within some sequence will result in (a) stopping the evaluation
of the expression in which E occurs and (b) continuing
by evaluating the combination designated by the label-closure
for .l within the environment specified by the label-closure.
Note that this mechanism allows transfer of control only to
expressions within the same sequence or expressions in a
sequence in some encompassing λ-expression.

The previously given notation for defining a sequence of
expressions is awkward. I thus introduce the following alternate
notation in place of the strict representation of a sequence.
Let E be a sequence of the form

    l_1[(T E_1) λπ.l_2[(T E_2) ... λπ.l_n[(T E_n) λπ.π]...]]

where the l_i, 1≤i≤n, indicate the possible occurrences of
labels. A sequence of this form will be alternately written

    l_1 E_1; l_2 E_2; ... ; l_n E_n

The addition of expressions of type (d), (e), and (f)
takes effect when it is desired to construct a sequence of
expressions to be evaluated one after another or to interrupt
the evaluation of a sequence and to continue the evaluation
at some other labeled expression.

For example, consider the expression 

    LET A = 5
    IN (A ASSIGN. (+(A,1)));
       (GOTO. .P);
       (A ASSIGN. 1);
       P: A






where "+" is a free variable whose value is the function for
computing the arithmetic sum of two integers. The evaluation
of this expression is as follows:

(1) The value of the bound variable A will be set to
five and the body of the λ-expression evaluated.

(2) Since the body of the λ-expression is a sequence
of expressions, each of the component expressions
will be evaluated in order.

(3) The first expression in the sequence results in
updating the value of A to six.

(4) The second expression results in transferring the
evaluation to the expression labeled P.

(5) The evaluation of the expression labeled P results
in returning the value of A, which has been set to
six.
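
The evaluation just traced can be imitated by a small Python sketch.
It is not the evaluator of this chapter: it assumes that a sequence
is given as a list of (label, step) pairs, that the environment is a
plain dictionary, and that a GOTO only targets a label in the same
sequence. The names Goto, run_sequence, and assign are hypothetical.

```python
class Goto:
    """Result of evaluating (GOTO. .label): a request to transfer control."""
    def __init__(self, label):
        self.label = label


def run_sequence(steps, env):
    """Evaluate a sequence given as (label_or_None, function) pairs.

    Each function takes the environment dict and returns a value or a Goto.
    """
    labels = {label: i for i, (label, _) in enumerate(steps) if label}
    i, value = 0, ""
    while i < len(steps):
        value = steps[i][1](env)
        if isinstance(value, Goto):
            i = labels[value.label]     # continue at the labeled expression
        else:
            i += 1
    return value


def assign(name, compute):
    """(name ASSIGN. E): update the variable, return the null string."""
    def step(env):
        env[name] = compute(env)
        return ""
    return step


program = [
    (None, assign("A", lambda env: env["A"] + 1)),   # (A ASSIGN. (+(A,1)))
    (None, lambda env: Goto("P")),                   # (GOTO. .P)
    (None, assign("A", lambda env: 1)),              # skipped by the GOTO
    ("P", lambda env: env["A"]),                     # P: A
]
print(run_sequence(program, {"A": 5}))
```

With the GOTO in place the third expression is never evaluated, so
the sequence returns six rather than one.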



Recursive Definitions: 

Consider the following (loosely written) expression 
defining the factorial function and its application to the 
integer five: 

    LET FACT(N) = EQ(N,0) => 1
                  ELSE => N*FACT(N-1)
    IN FACT(5)

where EQ is a boolean valued function for testing the equality 
of two integers. The function "FACT" when applied to the argu- 
ment "5" will not evaluate to five factorial. The difficulty 
here arises in the definition of the function "FACT" where 
the variable "FACT" itself occurs as a free variable. This 
incorrect rendering of a recursive function can be corrected 
through the notion of a "fixed-point operator" [20, 25]. One
fixed-point operator for target language expressions is the
expression

    Y = λF. LET π = 'Λ'
            IN (π ASSIGN. (F π)); π


If M is an expression and F=E is a recursive definition of the
function F, an expression of the form

    LET F = E
    IN M

where E contains free occurrences of the variable F, can be
correctly written

    LET F = (Y λF.E)
    IN M

To avoid this somewhat awkward method for writing recursive
functions, the following alternate notation is introduced.

If F is a variable and E and M are expressions, an
expression of the form

    LET F = (Y λF.E) IN M

where Y is the fixed-point operator given above, can
alternately be written

    LET REC F=E IN M

Thus the definition of the factorial function can be correctly 
written 



    LET REC FACT(N) = EQ(N,0) => 1
                      ELSE => N*FACT(N-1)
    IN FACT(5)
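
The idea of a fixed-point operator built on assignment can be
sketched in Python. The sketch assumes that the operator creates a
variable with a dummy value, assigns to it the result of applying F
to that same variable, and returns it; the mutable cell is a
stand-in for a shared store pointer, and the helper name pi is
hypothetical.

```python
def Y(F):
    """Fixed point by assignment: create π, assign (F π) to it, return π."""
    cell = {"value": None}                  # π starts with a dummy value

    def pi(*args):
        # π is looked up at call time, so it sees the value assigned below.
        return cell["value"](*args)

    cell["value"] = F(pi)                   # (π ASSIGN. (F π))
    return cell["value"]                    # the sequence returns π


# LET REC FACT(N) = ... IN FACT(5), written as LET FACT = (Y λFACT.E)
fact = Y(lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1))
print(fact(5))
```

Because the recursive reference goes through the cell, the body of
the function finds its own definition by the time it is first called.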



The above fixed-point operator is sufficient to handle
recursive definitions of single functions but not simultaneous
recursive definitions of two or more functions. In this
dissertation simultaneous recursive definitions will not be
needed until the semantics of ALGOL/60 procedure declarations
is defined, and the presentation of a fixed-point operator to
handle simultaneous recursive definitions will be deferred
until the chapter on ALGOL/60. A detailed discussion of
fixed-point operators is given by Wozencraft [25].

A Definition of the Semantics of the ALGOL/60 Subset: 

The definition of the semantics of the ALGOL/60 subset in 
terms of the target language is given in Appendices 2.1 and 
2.2. The specification of the corresponding target language 
expression for a program in the subset has been broken into 
two parts. Appendix 2.1 defines the translation of a program
into the target language assuming that the primitive "+" is a
free variable. Appendix 2.2 defines the primitive "+". To
form the complete target language expression, one must take 
the target language string specified in Appendix 2.1 and add 
to it the primitive function definitions of Appendix 2.2 in 
the form 

LET CAT o*ts. -►• " [h-'" s "]" ]o 

IN LET EQ(α,β) = ...                                        (a)

IN LET REC +(X,Y) = EQ(Y,0) => 0 ELSE => SUM(SUCC X, PRED X)
IN LET d' IN s' 
where "LET d' IN s'" is the target language string specified
by Appendix 2.1.* For example, Appendix 2.1 specifies the
following pair of strings

    BEGIN INTEGER A; A:=1+2 END

    LET A = 'A'
    IN (A ASSIGN. (+('1','2')))

The string "LET A = 'A' IN (A ASSIGN. (+('1','2')))" when used
in place of "LET d' IN s'" in expression (a) above specifies
the complete target language expression for the program
"BEGIN INTEGER A; A:=1+2 END".**

3.2 An Evaluator for the Target Language 

To explain the semantics of the target language in the 
previous sections, an appeal was made through the English lan- 
guage. This section reduces that appeal to an appeal for 
understanding only the formalism of extended Markov algorithms. 



*This division of the specification of the semantics of a
computer language into a specification of a target language
string and a separate specification of the primitive functions
used in the target language string will be followed in the
definitions of SNOBOL/1 and ALGOL/60. Also, the definitions
of the string variables for the extended Markov algorithm
primitives are given at the beginning of Appendix 2.2. These
definitions must be added to each extended Markov algorithm
using the string variables.

**It may happen that the use of identifiers in a source language
program will conflict with the use of identifiers used to define
the primitive functions in the target language. To avoid
this conflict, the identifiers for the target language primitives
strictly speaking should be given as identifiers that
are different from the source language identifiers. This conflict
can be avoided by appending to each target language
identifier a symbol (e.g., the symbol "#") not allowed in
source language identifiers.
The "value" of a target language expression will be defined
in this section by an extended Markov algorithm definition of
a machine that mechanically converts an expression into another
expression, the value of the initial expression. The machine
may be viewed as a hypothetical computer for the target language,
and extended Markov algorithms may be viewed as the
machine language for the computer. The definition of the
target language evaluator is based on a similar definition
given by Landin [20, 24] and Wozencraft [25].

The extended Markov algorithm definition of the target
language evaluator is given in Appendix 2.3. Before applying
the algorithm to a target language expression, it is necessary
to provide a unique index for each "λ" and "(" in the
expression. Thus the expression

    (λX.('SQ' X) '3')

will be indexed

    (_1 λ_2 X.(_3 'SQ' X) '3')

The indices allow unique identification of a λ-expression
or combination.
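
The indexing step can be sketched as a short Python function. The
output notation (an underscore and a number after each indexed
symbol) and the function name index_expression are assumptions of
this sketch; the sketch also assumes that symbols inside quoted
constants are not indexed.

```python
def index_expression(expr):
    """Attach a unique index to each "λ" and "(" outside quoted constants."""
    out = []
    counter = 0
    in_constant = False
    for ch in expr:
        if ch == "'":
            in_constant = not in_constant   # entering or leaving a constant
        if ch in "(λ" and not in_constant:
            counter += 1
            out.append(f"{ch}_{counter} ")
        else:
            out.append(ch)
    return "".join(out)


print(index_expression("(λX.('SQ' X) '3')"))
```

Indices are handed out from left to right, so every λ-expression and
combination in the expression receives a distinct name.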

The evaluation of an expression begins with a substitution
rule transforming the expression to be evaluated into
five strings: the "control" string, the "result" string,
the "environment" string, the "store" string, and the expression
itself. Subsequent substitution rules define transformations
on the control, result, environment, and store strings
until the value of the target language expression is computed.
The final substitution rule returns the value of the expression.

Generally, the control string is a string of the form

    a_k a_(k-1) ... a_1

where each a_i, 1≤i≤k, is an atomic part of an expression
(e.g., a constant, variable, indexed lambda symbol, or indexed
left parenthesis). The control string is used to hold the
atomic parts of an expression before they are evaluated.

When the parts of the control string are evaluated, their
values are placed on the store string. The store string is a
string of the form

    (111...1,r_n) ... (111,r_3)(11,r_2)(1,r_1)

where each r_i, 1≤i≤n, is a string denoting the value of a
constant, a variable, or a λ-expression, and the string of
ones before each string value provides a unique pointer to
the string value. A new store component for a string r_(n+1) is
obtained by (a) obtaining the string of ones representing the
pointer p to r_n and (b) prefixing the string "(1p,r_(n+1))" to
the left of the store string.
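
As a rough illustration, the store string can be modeled in Python
as a list of (pointer, value) pairs with the leftmost component
first. The list representation and the names add_to_store and fetch
are assumptions of this sketch, not part of the evaluator.

```python
def add_to_store(store, value):
    """Prefix a new component (1p, value), where p is the leftmost pointer."""
    p = store[0][0] if store else ""        # pointer of the leftmost component
    new_pointer = "1" + p
    return [(new_pointer, value)] + store, new_pointer


def fetch(store, pointer):
    """Return the string value that a pointer (a string of ones) refers to."""
    for p, value in store:
        if p == pointer:
            return value
    raise KeyError(pointer)


store = [("1", "")]                         # a store holding only the null string
store, p3 = add_to_store(store, "3")
store, psq = add_to_store(store, "SQ")
print(store, fetch(store, p3))
```

Each new pointer is one "1" longer than the previous leftmost
pointer, which is what keeps the pointers unique.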

The result string is used to store pointers to intermediate
calculated values formed in the evaluation of a target
language expression. The result string is a string of the form

    p_m ... p_2 p_1

where each p_i, 1≤i≤m, is a pointer to some string value in
the store.

Let N_1,M_1,N_2,M_2, ..., N_k,M_k denote strings of ones, let
v_1,v_2, ..., v_k denote variables, and let p_1,p_2, ..., p_k
denote pointers to the store. The environment string is a
string of the form

    (N_k←M_k v_k=p_k) ... (N_2←M_2 v_2=p_2)(N_1←M_1 v_1=p_1)

where each component (N_i←M_i v_i=p_i) is a string such that N_i,
1≤i≤k, identifies the environment for some λ-expression λ_i,
v_i identifies the bound variable v of λ_i, p_i is a store
pointer to the current value of v, and M_i identifies the
environment of the encompassing λ-expression. The environment M_i is
said to be "linked" to the environment N_i. In general, the
environment components linked to N_i provide pointers to the
current values of each of the bound variables in the λ-expression
λ_i and its encompassing λ-expressions. The list of
environment components linked to N_i will be called the
environment N_i. For example, consider the environment "11111"
in the environment

    (11111←11 X=111111)(1111←11 A=11)(111←11 B=111)(11←1 Y=111)(1←1 Z=1)

The environment components linked to "11111" provide store
pointers to the current values of the variables X, Y, and Z in
the λ-expression whose environment is identified by "11111".

A new component is prefixed to the environment string
each time a new λ-expression is applied. Thus each N_i at the
left of each environment component identifies an environment
for some applied λ-expression, and the environment components
linked to N_i provide pointers to the values of the free variables
in the body of the λ-expression whose environment is
given by N_i. Since constants in the target language are
treated as literal strings whose values are the strings themselves,
the values of the constants in an expression are not
placed on the environment string.
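
The linking of environment components can be made concrete with a
small Python sketch that follows the example environment above.
Components are modeled as (N, M, variable, pointer) tuples; this
representation and the names linked_components and lookup are
assumptions of the sketch.

```python
# The example environment string, leftmost component first.
ENVIRONMENT = [
    ("11111", "11", "X", "111111"),
    ("1111", "11", "A", "11"),
    ("111", "11", "B", "111"),
    ("11", "1", "Y", "111"),
    ("1", "1", "Z", "1"),
]


def linked_components(environment, n):
    """Yield the components linked to environment n, innermost first."""
    by_name = {component[0]: component for component in environment}
    while True:
        component = by_name[n]
        yield component
        if component[1] == n:       # the initial environment links to itself
            return
        n = component[1]


def lookup(environment, n, variable):
    """Return the store pointer for a variable as seen from environment n."""
    for _, _, v, pointer in linked_components(environment, n):
        if v == variable:
            return pointer
    raise KeyError(variable)


print([c[2] for c in linked_components(ENVIRONMENT, "11111")])
print(lookup(ENVIRONMENT, "11111", "Y"))
```

The variables A and B are present in the environment string but are
not reachable from "11111", because their components are not on its
chain of links.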

The set definitions for the string variables used in 
the extended Markov algorithm definition of the evaluator are 
given in Appendix 2.3a. The set "STR" defines the set of all 
strings that might occur within a target language expression. 
The sets "CONSTANT" and "VARIABLE" define the sets of con- 
stants and variables. The sets "PTR" and "INDEX" define 
respectively the set of pointers to the store string and the 
set of indices used in marking an expression. The set "EXP" 
defines the set of target language expressions, the set "EXP HD" 
defines the set of strings that can occur at the head of an 
expression, and the set "EXP TL" defines the set of strings 
that can occur at the tail of an expression. For example, in
the expression "(_1 λ_2 X.(_3 'SQ' X) '3')" the string "(_1" is the
head of the expression and the string "λ_2 X.(_3 'SQ' X) '3')" is
the tail of the expression, and in the expression "X" the
variable "X" is the head of the expression and the tail of
the expression is null.

The substitution rules for the extended Markov algorithm
definition of the target language evaluator are given in
Appendix 2.3b. Three alternate notations were used in writing
these rules:

(1) Let x_i and y_i, 1≤i≤5, be string variables representing
arbitrary strings used in an extended
Markov algorithm. Generally, each substitution
rule is of the form*

    <x_1cy_1-x_2ry_2-x_3ey_3-x_4sy_4-x_5py_5>
        → <x_1c'y_1-x_2r'y_2-x_3e'y_3-x_4s'y_4-x_5p'y_5>

where the c, r, e, s, and p are strings referring to
portions of the control, result, environment, store,
and expression strings and the c', r', e', s', and
p' are the transformed portions of these strings.
Since the x_i and y_i occur in each substitution rule,
a substitution rule of the above form will be written
in the form

    c        c'
    r        r'
    e   →    e'
    s        s'
    p        p'

(2) If one of the five strings c, r, e, s, or p is given
as null on both sides of the substitution rule, the
symbol " " can be used in place of the null string
symbol "Λ".

(3) If one of the five components c, r, e, s, or p occurs
unchanged in the right-hand side of the substitution
rule, the symbol "I" can be used in place of the
string in the right-hand side of the rule.

*The hyphen "-" is used to separate the control, result,
environment, store, and expression strings.
Thus the substitution rule

    <(_i y_1-x_2Λy_2-x_3Λy_3-x_4Λy_4-x_5(_i ht h't')y_5>
        → <h' h APPLY. y_1-x_2Λy_2-x_3Λy_3-x_4Λy_4-x_5(_i ht h't')y_5>

can be written using notation (1)

    (_i                    h' h APPLY.
    Λ                      Λ
    Λ               →      Λ
    Λ                      Λ
    (_i ht h't')           (_i ht h't')

and further written using notations (2) and (3)

    (_i                    h' h APPLY.


                    →

    (_i ht h't')           I



Three example evaluations of target language expressions
are given on the adjacent pages. Each of these evaluations
shows the successive transformations on one of the initial
expressions:*

    ('SQ' '3')        LET X='3'            LET X='3'
                      IN ('SQ' X)          IN (X ASSIGN. '4');
                                              (GOTO. .L);
                                              (X ASSIGN. '5');
                                              L: X

*The constant 'SQ' in the first two expressions represents
the primitive function for squaring an integer. Strictly
speaking, all primitive functions in the target language
must be defined by constants that are extended Markov
algorithms.
Initialization and Termination of Evaluation (rules 1 and 12):*

The evaluation of an expression begins (rule 1) by
initializing the control string with the head of the expression
to be evaluated and the marker "|_1", initializing the
result string with the marker "|_1", initializing the environment
string with the string "(1←1 π=1)", initializing the
store string with the string "(1,Λ)", and initializing the
expression string with the expression to be evaluated. Since
the initial environment will generally contain the values of
no free variables, the initial environment string contains
the dummy variable π whose value is a pointer to the null
string in the store. The marker "|_1" is placed on the control
and result strings to denote that the head of the expression is
to be evaluated within the initial environment 1. In general,
the subscript j of the leftmost |_j in the control string denotes
that the control string variables to the left of the
|_j are to be evaluated using the environment j, i.e., using
the environment components linked to the component
(N_i←M_i v_i=p_i) where N_i=j.

The evaluation terminates (rule 12) when the control
string is null. When the control string is null, the result
string will contain a pointer to some string value in the
store. The string in the store is returned as the result of
the evaluation. In general, the result of an evaluation is
either a constant or a λ-closure. Strictly speaking, if
the result of the evaluation is a λ-closure, the λ-expression
and the values of its free variables should be returned as the
result of the evaluation. If the result of the evaluation is
a λ-closure, the λ-expression and the values of its free
variables can be obtained from the environment, store, and
expression strings specified prior to the termination of
evaluation.

*Rules 1 and 12 do not exactly follow the alternate notation
for the evaluator given earlier. These rules are strictly
given as

    ht → <h |_1-|_1-(1←1 π=1)-(1,Λ)-ht>

    <Λ-p-x_3y_3-x_4(p,r)y_4-x_5y_5> → r

If a user were evaluating target language expressions
with input-output facilities, (a) the initial values of the
input and output strings (presumably those given on some
device like a teletype or card reader) could be placed in
the initial store string and (b) two system variables and
pointers to their initial values could be placed on the
initial environment string. The addition or removal of
strings on the input or output device could then be defined
by updating the values of the system variables to their new
values. This is the mechanism used to define input-output
in SNOBOL/1 (see Chapter IV).

Evaluation of Combinations (rule 2):

If a left parenthesis of a combination is at the left
of the control string, the left parenthesis is removed from
the control string,* and the heads of its operand and operator
are prefixed to the control string and the string "APPLY." is
placed to the right of these two strings. Subsequent rules
will evaluate the operand and operator, and then apply the
value of the operator to the value of the operand to produce
the value of the combination.

Evaluation and Application of λ-expressions (rules 3, 8, and 11):

If the name λ_i of a λ-expression is at the left of the
control string (rule 3), the current environment j (initially
the dummy environment 1) is obtained, the string "λ_i e_j" is
placed in a new component at the left of the store string,
and a pointer to the new store component is prefixed to the
result string. The string "λ_i e_j" represents the λ-closure
for λ_i in that (a) λ_i provides a name uniquely identifying
the λ-expression λ_i contained in the expression string and
(b) the environment component j provides the (linked) list of
the pointers to the current values of the free variables of
the λ-expression λ_i.

If the string "APPLY." is at the left of the control
string, a pointer p to a λ-closure λ_i e_j is at the left of the
result string, and k is the index of the most recently added
environment component (rule 8):

*In the discussion to follow, unless explicitly stated
otherwise, the elements referred to at the left of the
control string are assumed to be deleted from the control
string after being evaluated.

(a) a new component (1k←j v=p'), where v is the bound
variable of the λ-expression λ_i and p' is a pointer
to the operand to which the λ-expression λ_i has
been applied, is prefixed to the environment string.
(This action results in setting the proper environment
for evaluating the body of the λ-expression λ_i.)

(b) the head of the body of the λ-expression λ_i and a
marker | are prefixed to the control string, and

(c) the pointers p and p' to the λ-closure and its
operand are deleted from the result string and the
marker | is prefixed to the result string.

If a marker | is at the left of the control string and
a pointer p and marker | are at the left of the result string,
the markers are deleted and the pointer p is left on the
result string. The pointer will point to the value of applying
the λ-expression to its operand.
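
The interplay of the control, result, environment, and store
strings can be illustrated with a compact Python sketch. It is a
simplified stack machine, not the extended Markov algorithm of
Appendix 2.3: it assumes expressions are given as nested tuples,
uses integer positions instead of strings of ones, omits markers,
ASSIGN, and GOTO, and takes primitives from a Python dictionary.
All names in it are hypothetical.

```python
def lookup(envs, index, name):
    """Search the environment components linked to index for a variable."""
    while index is not None:
        component = envs[index]
        if component["var"] == name:
            return component["ptr"]
        index = component["parent"]
    raise NameError(name)


def evaluate(expr, primitives):
    store = [""]                                   # pointer 0: the null string
    envs = [{"parent": None, "var": "π", "ptr": 0}]  # dummy initial environment
    control = [("eval", expr, 0)]                  # top of the stack is the end
    result = []
    while control:
        tag, item, env = control.pop()
        if tag == "eval" and item[0] == "const":
            store.append(item[1])                  # constants are literal strings
            result.append(len(store) - 1)
        elif tag == "eval" and item[0] == "var":
            result.append(lookup(envs, env, item[1]))
        elif tag == "eval" and item[0] == "lam":
            store.append(("closure", item, env))   # λ-closure: λ plus environment
            result.append(len(store) - 1)
        elif tag == "eval" and item[0] == "app":
            _, operator, operand = item
            control.append(("apply", None, env))
            control.append(("eval", operator, env))
            control.append(("eval", operand, env))  # operand is evaluated first
        else:                                      # tag == "apply"
            f = store[result.pop()]                # value of the operator
            arg_ptr = result.pop()                 # pointer to the operand value
            if isinstance(f, tuple):               # applying a λ-closure
                _, (_, var, body), closure_env = f
                envs.append({"parent": closure_env, "var": var, "ptr": arg_ptr})
                control.append(("eval", body, len(envs) - 1))
            else:                                  # applying a primitive constant
                store.append(primitives[f](store[arg_ptr]))
                result.append(len(store) - 1)
    return store[result.pop()]


# (λX.('SQ' X) '3')
EXPR = ("app", ("lam", "X", ("app", ("const", "SQ"), ("var", "X"))), ("const", "3"))
PRIMITIVES = {"SQ": lambda s: str(int(s) ** 2)}
print(evaluate(EXPR, PRIMITIVES))
```

Applying a closure adds one environment component linked to the
environment in which the closure was formed, which is the role of
the component (1k←j v=p') described above.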



Evaluation of Variables and Constants (rules 4 and 6):

If a variable is at the left of the control string, a
pointer to the current value of the variable is prefixed to
the result string (rule 4.1). The pointer is obtained by
(a) obtaining the index j of the current environment and
marking the environment component j with the symbol "°" (rule
4.3), and (b) then searching (rules 4.1 and 4.2) through the
environment components linked to j for the occurrence of the
variable.

If a constant is at the left of the control string (rule
6), a new store component containing the constant is prefixed
to the store string, and the pointer to the new store
component is prefixed to the result string.*

Evaluation of Label References (rules 5):

If a label reference .l is at the left of the control
string (rules 5), each environment component linked to the
current environment component is searched for the occurrence
of a component such that the λ-expression whose environment
is specified by the component contains a body that is a
sequence containing the label. If the label is found, a
new store component h e_j containing the head of the expression
following the label and the index j of the environment component
is prefixed to the store, and a pointer to the new
store component is placed on the result string. The head of
the labeled expression and the environment index j provide
a representation of the label-closure for .l in that the
head of the labeled expression uniquely identifies the labeled
combination and the index j uniquely identifies the current
environment of the sequence within which the combination
occurs.

Transfer of Control (rule 10): 

If the string "GOTO. APPLY." is at the left of the control
string and a pointer p to a label-closure h e_j, where
h is the head of a labeled expression and j is the environment
within which the labeled expression is to be evaluated,
is at the left of the result string

(a) all portions of the control and result strings to
the left of the markers |_j are deleted, and

(b) the head of the expression following the label is
prefixed to the control string.

This mechanism results in interrupting the evaluation of the
current expression and continuing with the evaluation at the
labeled expression using the environment j specified while
evaluating the label-closure.

*In the evaluator, all constants that are extended Markov
algorithms must be enclosed by the quotation marks ' and '.

Application of Constants (rules 9.1 and 9.2): 

If the string "APPLY." is at the left of the control 
string, and two store pointers p and p' to the strings s 
and s' are at the left of the result string, the string s 
is applied to the string s' (presumably s is an extended
Markov algorithm and s' is the object string to which the 
algorithm is to be applied). The resulting string value is 
placed in a new store component, and the pointer to the new 
component is prefixed to the result string. 

Assignment (rules 7.1 and 7.2): 

If the string "ASSIGN. APPLY." is at the left of the
control string and two store pointers p and p' are at the
left of the result string, the string value in the store
associated with p is changed to the string value associated
with p'.
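
The effect of assignment on the store can be pictured with a few
lines of Python. The dictionary representation of the store and the
variable names, including ALIAS, are assumptions of this sketch; it
shows only the update of a store component, not rules 7.1 and 7.2
themselves.

```python
def assign(store, p, p_prime):
    """Return a store in which the value at p is replaced by the value at p'."""
    new_store = dict(store)
    new_store[p] = store[p_prime]
    return new_store


store = {"1": "", "11": "3", "111": "4"}
environment = {"X": "11", "ALIAS": "11"}    # two variables sharing one pointer
store = assign(store, environment["X"], "111")
print(store[environment["X"]], store[environment["ALIAS"]])
```

Since variables refer to their values through store pointers, every
variable that shares the pointer p observes the new value.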

Addition of New Rules to the Evaluator:

It may happen that certain source language constructions
are awkward to define solely within the target language and
that these constructions can be more easily defined by adding
new expressions to the target language and new evaluator
rules to evaluate these expressions.

The rule applied to evaluate target language expressions
is specified by the numerically first rule that is applicable
to the current string values of the control, result, environment,
store, and expression strings. By adding a rule to the
evaluator whose left part specifies a configuration of the
control, result, environment, store, and expression strings
that, for the given configuration, provides a different
transformation from the initial evaluator rules, the evaluator can
be extended to define new types of target language expressions.

Generally, the rule applied by the evaluator is determined
by the element at the left of the control string. For
example, in the definition of indirect addressing in SNOBOL/1,
it was desired to add a rule to the evaluator that would take
some string value given in store and prefix the string value
to the control string. The string value prefixed to the control
string would then be evaluated in subsequent transformations
as if the string value were itself a variable. By (a) allowing
expressions of the form "(LOOKUP. X)", where X is a
variable, in the target language translation of SNOBOL/1, and
(b) adding the rule

    LOOKUP. APPLY.
    p

    (p,s)



to the evaluator, the extended evaluator defines indirect 
addressing. None of the initial evaluator rules are appli- 
cable to a configuration where the string "LOOKUP." is at the 
left of the control string; hence the rule can be placed in 
any numerical position within the initial sequence of rules. 
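
The way a new rule extends the evaluator can be sketched in Python
as an ordered list of rules, each a pair of an applicability test
and a transformation. The state layout, the token lists, and the
names step, lookup_applicable, and lookup_transform are assumptions
of this sketch; the transformation follows the prose description of
indirect addressing rather than the exact rule of the evaluator.

```python
def step(rules, state):
    """Apply the numerically first rule whose left part matches the state."""
    for applicable, transform in rules:
        if applicable(state):
            return transform(state)
    raise ValueError("no rule is applicable")


def lookup_applicable(state):
    """Left part: "LOOKUP. APPLY." on the control, a pointer on the result."""
    return state["control"][:2] == ["LOOKUP.", "APPLY."] and bool(state["result"])


def lookup_transform(state):
    """Prefix the stored string to the control string for later evaluation."""
    p = state["result"][0]
    return {
        "control": [state["store"][p]] + state["control"][2:],
        "result": state["result"][1:],
        "store": state["store"],
    }


initial_rules = []          # stand-in for the initial evaluator rules
rules = initial_rules + [(lookup_applicable, lookup_transform)]

state = {
    "control": ["LOOKUP.", "APPLY.", "|"],
    "result": ["11", "|"],
    "store": {"1": "", "11": "X"},
}
print(step(rules, state))
```

Because no other rule matches a control string that begins with
"LOOKUP.", the position of the new pair in the list does not change
the behavior of the existing rules.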

3. Discussion 

This chapter has presented a formally based target lan- 
guage in which the semantics of a computer language can be 
defined. The semantics of the target language was, in turn, 
defined in terms of the formalism of extended Markov algorithms 
by giving an extended Markov algorithm definition of a machine 
for evaluating target language expressions. 

If used as a target language for the implementation* of
a computer language, the target language allows the simple
addition of built-in machine primitives. For example, if a
computer has a built-in primitive for computing the sum of
two integers, there is no need to define this primitive in
the target language. This primitive can be used as a constant
in the target language and in applying the primitive to its
arguments the machine algorithm can be used. The point of
using only extended Markov algorithms to define primitive
functions is that for implementation of the target language
the only necessary machine capability is that for implementing
extended Markov algorithms. The fact that a given
machine has certain built-in primitives simply relieves the
person defining the semantics of a source language of defining
the semantics of the built-in primitives in terms of
extended Markov algorithms.

*Extended Markov algorithms have been implemented in the
source language PANON-1B [11, 14].

The target language is undesirable in one important sense. The
computer language constructions for defining the assignment of new
values to variables and for defining the transfer of control
within a program required the addition of new expressions to the
combined formalisms of extended Markov algorithms and the
λ-calculus. The new expressions add to the complexity of the
target language and restrict the applicability of any theorems
developed for λ-calculus expressions. This undesirable feature is,
in part, redeemed in that the evaluator for the target language
was completely defined within the formalism of extended Markov
algorithms. Nevertheless, the deficiency remains, and I hope that
future research will resolve this difficulty.

On the other hand, the target language is sufficient to define the
semantics of both SNOBOL/1 and ALGOL/60. The presentation of the
syntax and semantics of these two languages will comprise the next
two chapters of this dissertation.



Do you know who Kohmar Pehriad was?

Hint: He has certainly left his mark on history. 
CHAPTER IV

A DEFINITION OF THE SYNTAX AND 
SEMANTICS OF SNOBOL/1 

In this chapter I attempt to demonstrate the thesis of this
dissertation, that there should be formal definitions of the
syntax and semantics of computer languages. As an example computer
language, I have chosen SNOBOL/1, as initially defined by Farber,
Griswold and Polonsky. SNOBOL/1 was chosen because (a) the
language is simple enough to describe conveniently in a single
chapter of this dissertation and (b) the language is fairly well
known. No knowledge of SNOBOL/1 will be assumed in this chapter.
Rather, it is the intent of this chapter to define every construct
(except character spacing) in the language. The definition of
SNOBOL/1 will be in two parts: (a) an informal description, in
English, of the language and of the techniques used in the formal
definition, given in this chapter, and (b) a formal description of
the language, using the formal system, given in Appendix 3.

This chapter and the formal description of Appendix 3 may be
viewed as a reference manual for SNOBOL/1. It is intended for a
user who wishes a detailed description of the language.

The formal definition of SNOBOL/1 is divided into three parts.
Appendix 3.1 gives the canonical system defining the syntax of
SNOBOL/1, Appendix 3.2 gives the canonical system defining the
translation of SNOBOL/1 into the target language, and Appendix 3.3
gives the definition of the primitive functions used in the target
language. In writing the formal definition of SNOBOL/1, it was
necessary to resolve a few issues that were ambiguously or
incompletely defined by the English language definition of the
language given by Farber, Griswold and Polonsky.*

Introduction to SNOBOL/1

SNOBOL/1 is a language for defining transformations on strings of
symbols. Programs in SNOBOL/1 consist of a linear sequence of
rules, of which there are four varieties: "input" rules for
obtaining strings of symbols from some external input device (like
a teletype or card reader), "assignment" rules for assigning names
to strings, "pattern matching" rules for transforming strings into
new strings, and "output" rules for writing strings on some
external output device (like a teletype or card reader). In
general, the behavior defined by each rule is executed in linear
order. However, rules can be labeled with names and the ordinary
sequence of execution interrupted and continued at some other
labeled rule.

*For example, it was not clear whether the authors meant to permit
or prohibit the use of the same variable name to denote different
types of variables in a single pattern matching rule, or whether
to permit or prohibit the use of a name both as a string name and
as a label in the same program. I decided to prohibit the first of
these constructions and to permit the second.

Introduction to the Techniques Used in Describing SNOBOL/1

The parts of this chapter will each describe some construct in
SNOBOL/1, e.g., a string, an arithmetic expression, a rule, or a
statement. Each of these parts will consist of (a) portions of the
productions from the canonical system of the translation of
SNOBOL/1 (Appendix 3.2), (b) examples of the SNOBOL/1 constructs
and their corresponding target language translations, and (c) an
English language explanation of these constructs and their
semantics as defined in the target language.

Theoretically, the (abbreviated) canonical system of the
translation of SNOBOL/1 must be combined with the canonical system
of the syntax of SNOBOL/1 to obtain the complete canonical system
defining the set of legal programs and their target language
translations. Nevertheless, except for the context-sensitive
requirements on SNOBOL/1, the abbreviated canonical system of the
translation provides a synopsis of a context-free specification of
the language and its semantics in terms of the target language.
Accordingly, the productions from the (abbreviated) canonical
system of the translation will be used in the text to define the
syntax and semantics of SNOBOL/1, and the specification of the
context-sensitive requirements on syntax will be discussed at the
end of the chapter.




As mentioned in the previous chapter, the first term of each term
tuple in the specification of the translation of a language is
generally of the form "s..t", where "s" represents some string in
the source language and "t" represents the corresponding target
language translation. The example SNOBOL/1 strings and their
target language translations given in the text follow this
notation.

Strings 

DIGIT<0>,<1>, ... ,<9>;

LETTER<A>,<B>, ... ,<Z>;

MARK<*>,<.>,<=>, ... ,</>;

DIGIT<p> | LETTER<p> | MARK<p> → BASIC SYMBOL<p>;

BASIC SYMBOL<b> → STRING<SEQ(b)>;

Example Strings:

ABC123*                 A ROSE IS A ROSE

HESSE, KAFKA, MANN      ALPHA

The basic symbols in SNOBOL/1 are the decimal digits, the capital
English letters, and a variety of other symbols like "*", "." and
"=". A string, the basic data type, consists of any linear
sequence of basic symbols.

Names 

DIGIT<p> | LETTER<p> → NAME<p>;

NAME<m>,<n> → NAME<mn>,<m.n>;

NAME<n> → STR NAME<n..n>,<$n..(LOOKUP. n)>;

NAME<n> → VAR NAME<n>;

NAME<n> → BACK REF NAME<n>;

Example Names:

ALPHA     1234     ABC.EFG     12.3     $BETA     $1234


A string can be assigned a name and the name used in 
place of the string. A name consists of a sequence of decimal 
digits and English letters, possibly including medial periods. 

Besides designating a string, a name can be used in two 
other contexts, that of a string "variable" and that of a 
string "back reference." These three uses of names shall be 
distinguished by calling a name that designates a string a 
"string name," a name that designates a variable a "variable 
name," and a name that designates a back reference a "back 
reference name." A string name is treated as a variable in 
the target language. 

A string name can be indirectly referenced by prefixing it with a
dollar sign. The string value of a string name prefixed by a
dollar sign is the string whose name is the string value of the
prefixed name. For example, if the string value of the name "BETA"
is the string "A ROSE IS A ROSE" and the string value of the name
"A" is the string "BETA", then the string value of "$A" is the
string "A ROSE IS A ROSE". The primitive function "LOOKUP." is
used to handle indirect addressing in the target language.
"LOOKUP." is defined by an extended Markov algorithm substitution
rule (Appendix 3.3d) that must be added to the target language
evaluator.* When evaluated, this substitution rule inserts the
string value of a name at the left of the control string. Thus the
string is treated as if it were itself a variable to be evaluated
in subsequent steps taken by the evaluator.
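
The effect of an indirect reference can be sketched in Python. This
is only an illustration under stated assumptions, not the extended
Markov algorithm rule of Appendix 3.3d: the dictionary `env` and the
functions `value_of` and `lookup` are hypothetical stand-ins for the
evaluator's environment and for the primitive "LOOKUP.".

```python
# Hypothetical model: the environment maps names to string values.
def value_of(env, name):
    """String value of a plain string name such as BETA."""
    return env[name]


def lookup(env, name):
    """String value of $name: the value of `name` is itself treated as
    a name, and the value of that second name is returned."""
    return value_of(env, value_of(env, name))


env = {"BETA": "A ROSE IS A ROSE", "A": "BETA"}
print(lookup(env, "A"))  # prints: A ROSE IS A ROSE
```

The two-step evaluation mirrors the description above: the first
lookup yields the string "BETA", which is then evaluated as if it
were a variable.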

Arithmetic Expressions

DIGIT<d> → DIGIT STR<SEQ(d)>;

DIGIT STR<s> → INT<s>,<-s>;

INT<i> → ARITH OPERAND<"i"..'i'>;

STR NAME<n..n'> → ARITH OPERAND<n..n'>;

ARITH OPERAND<a..a'>,<b..b'> → ARITH EXP<a+b..(+(a',b'))>,
                                         <a-b..(-(a',b'))>,
                                         <a*b..(*(a',b'))>,
                                         <a/b..(/(a',b'))>;

Example Arith Operands:

"65"..'65'
"-65"..'-65'
A..A

Example Arith Expressions:

A+B..(+(A,B))
A+"65"..(+(A,'65'))
A*"-65"..(*(A,'-65'))



SNOBOL/1 allows a limited type of arithmetic on strings whose
contents are integers. An integer can be used directly as an
arithmetic operand by enclosing the integer in quotation marks. A
name whose string value is an integer can also be used as an
arithmetic operand. An arithmetic expression consists of an
arithmetic operand followed by one of the arithmetic operators
"+", "-", "*", and "/" (defined in Appendix 3.3b) followed by
another arithmetic operand. The string value of an arithmetic
expression is the string computed by applying the arithmetic
operator to the integer values of the two operands.

*As mentioned in the chapter describing the target language
evaluator, it may occasionally be convenient to define some source
language constructs by adding rules to the evaluator rather than
by defining the constructs solely within the target language. To
define indirect addressing in the target language would require
complicated additions to the canonical system of the translation
of SNOBOL/1.

String Expressions 

STRING EXP<Λ..'Λ'>;

STRING<s> → STRING EXP<"s"..'s'>;

STR NAME<n..n'> → STRING EXP<n..n'>;

ARITH EXP<a..a'> → STRING EXP<a..a'>;

STRING EXP<s..s'>,<t..t'> → STRING EXP<sσt..((CAT s') t')>;

Example String Expressions:

Λ..'Λ'                     NAME REVERSE..((CAT NAME) REVERSE)
"ABC123*"..'ABC123*'       "ABC" A..((CAT 'ABC') A)
A..A                       X Y Z..((CAT ((CAT X) Y)) Z)
$A..(LOOKUP. A)

A string expression in SNOBOL/1 is an expression whose value is a
string. A string can be used directly in a string expression by
enclosing the string in quotation marks. A string name or
arithmetic expression can also be used in a string expression. A
sequence of string expressions, each separated by one or more
spaces,* comprises a complete string expression. The value of a
string expression is the string computed by concatenating the
string values of each of the component string expressions.
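
The way a sequence of component expressions is translated into
nested applications of "CAT" can be sketched as follows. The
function name `cat_translation` is hypothetical, and the left-nested
grouping is an assumption taken from the example
"X Y Z..((CAT ((CAT X) Y)) Z)" rather than from the production
itself, which does not fix the grouping.

```python
def cat_translation(operands):
    """Combine already-translated operands into left-nested CAT
    applications, e.g. X, Y, Z -> ((CAT ((CAT X) Y)) Z)."""
    if not operands:
        raise ValueError("at least one operand is required")
    result = operands[0]
    for operand in operands[1:]:
        # Each further operand wraps the translation built so far.
        result = f"((CAT {result}) {operand})"
    return result


print(cat_translation(["X", "Y", "Z"]))
print(cat_translation(["'ABC'", "A"]))
```

A single operand is returned unchanged, which corresponds to the
productions that turn a string, a string name, or an arithmetic
expression directly into a string expression.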



*The symbol "σ" denotes one or more spaces.
Patterns*

STRING<s> → PAT EXP<"s"..'s'>;

STR NAME<n..n'> → PAT EXP<n..n'>;

VAR NAME<n> → PAT EXP:SPECS<*n*..'n' : n∈STR |>;

VAR NAME<n> → PAT EXP:SPECS<*(n)*..'n' : n∈BAL STR |>;

VAR NAME<n>, DIGIT STR<d> → PAT EXP:SPECS<*n/d*..'n' :
                                (n.d)∈FIX LN STR |>;

BACK REF NAME<n> → PAT EXP<n..'n'>;

PAT EXP<p..p'>,<q..q'> → PAT EXP<pσq..((CAT p') q')>;

PAT EXP<p..p'> → PATTERN<p..p'>;

Example Patterns:

"ABC"..'ABC'

X Y..((CAT X) Y)

*NAME*..'NAME' : NAME∈STR |

*NAME* ","..((CAT 'NAME') ',') : NAME∈STR |

*X* "ABC" *(Y)*..((CAT ((CAT 'X') 'ABC')) 'Y') : X∈STR | Y∈BAL STR |

*X* Y X..((CAT ((CAT 'X') Y)) 'X') : X∈STR |

A pattern in SNOBOL/1 is the basic unit through which string
transformations are accomplished. A pattern can be viewed as an
expression representing a set of strings.

A string enclosed by quotation marks is a pattern expression
representing the set of strings containing one member, the string
itself. A string name is a pattern expression representing the set
of strings containing one member, the string value of the string
name. A variable name enclosed by asterisks is a pattern
expression representing the set of all strings of basic symbols. A
variable name enclosed by parentheses and further enclosed by
asterisks is a pattern expression representing the set of all
strings containing balanced pairs of parentheses. A variable name
followed by a slash and a positive integer and enclosed by
asterisks is a pattern expression representing the set of all
strings whose number of basic symbols is given by the integer
following the slash. A name that occurs elsewhere in a pattern as
a variable name is a pattern expression representing the same set
of strings represented by the variable name. A name used in this
context is called a back reference name.

*The use of the auxiliary term for the predicate part "SPECS" will
be discussed shortly.

A sequence of pattern expressions, each separated by one or more
spaces, comprises a complete pattern. A sequence of pattern
expressions represents the set of all strings composed by
concatenating representative strings from each of the sets
represented by the component pattern expressions. This set is
restricted in that a string used in place of a back reference name
must be identical to the string used in place of the corresponding
variable name.

A pattern is used to scan a given object string for the existence
of one of the strings represented by the pattern. If more than one
string represented by the pattern occurs within the object string,
the occurrence of the pattern is taken to be the member M such
that (a) each of the strings (except the last) concatenated to
form M is, from left to right, as short as possible and (b) the
last string concatenated to form M is as long as possible.
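
The selection rule above can be approximated with Python's `re`
module, which is a substitute technique and not the extended Markov
algorithm used in the formal definition. In this sketch the tuple
encoding of pattern elements and the names `compile_pattern` and
`find_occurrence` are hypothetical. Balanced-parenthesis variables
are omitted because regular expressions cannot describe them, and
the sketch assumes that the scan prefers the leftmost starting
position, which the text does not state explicitly.

```python
import re


def compile_pattern(elements):
    """Translate pattern elements into a regular expression.

    Elements: ("lit", text), ("var", name), ("fix", name, length),
    ("ref", name). Every variable except a final one is matched
    lazily (as short as possible); a final variable is matched
    greedily (as long as possible).
    """
    groups = {}  # SNOBOL name -> regex group name
    parts = []
    last = len(elements) - 1
    for i, elem in enumerate(elements):
        kind = elem[0]
        if kind == "lit":
            parts.append(re.escape(elem[1]))
        elif kind == "var":
            groups[elem[1]] = f"g{len(groups)}"
            quantifier = ".*" if i == last else ".*?"
            parts.append(f"(?P<{groups[elem[1]]}>{quantifier})")
        elif kind == "fix":
            groups[elem[1]] = f"g{len(groups)}"
            parts.append(f"(?P<{groups[elem[1]]}>.{{{elem[2]}}})")
        elif kind == "ref":
            # A back reference must repeat the variable's string.
            parts.append(f"(?P={groups[elem[1]]})")
        else:
            raise ValueError(f"unknown element kind: {kind}")
    return re.compile("".join(parts), re.DOTALL), groups


def find_occurrence(elements, subject):
    """Return (start, end, bindings) for the chosen occurrence,
    or None if the pattern does not occur in the subject."""
    regex, groups = compile_pattern(elements)
    match = regex.search(subject)
    if match is None:
        return None
    bindings = {name: match.group(g) for name, g in groups.items()}
    return match.start(), match.end(), bindings


print(find_occurrence([("var", "NAME"), ("lit", ",")], "HESSE,KAFKA,MANN"))
```

In the usage line, the variable NAME is not the last element, so it
takes the shortest possible string before the first comma.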
Pattern Matching Rules

STR NAME<n..n'>, STR EXP<s..s'>, PATTERN:SPECS:VAR REFS<p..p':c:v>
    → PAT MATCH RULE<nσp=s..
          (MATCH_AND_ASSIGN(n',p',λπ.s','c','(v)'))>;

Example Pattern Matching Rules:

X "ABC" =..(MATCH_AND_ASSIGN(X,'ABC',λπ.'Λ','','()'))
X *NAME* "," =..(MATCH_AND_ASSIGN(X,((CAT 'NAME') ','),
                     λπ.'Λ','NAME∈STR |','(NAME,)'))
X ALPHA = BETA..(MATCH_AND_ASSIGN(X,ALPHA,λπ.BETA,'','()'))

A pattern matching rule consists of a string name followed by a
pattern, an equal sign, and a string expression. The execution of
a pattern matching rule results in the following sequence of
actions:

(a) The string value of the string name is scanned for the
    occurrence of the pattern.

(b) If the occurrence of the pattern is found

    (i)   each string variable in the pattern is assigned the
          value of the substring used in matching the variable to
          the object string,

    (ii)  the string expression is evaluated (using the new values
          of the string variables), and

    (iii) the occurrence of the pattern in the object string is
          replaced by the string value of the string expression
          and the string name is assigned the value of this newly
          formed string.

(c) If the occurrence of the pattern is not found, no action is
    taken.

The pattern matching capability of SNOBOL/1 is handled in the
target language through the function "MATCH_AND_ASSIGN" (see
Appendix 3.3c), which essentially forms an extended Markov
algorithm that reflects the same transformation defined by the
pattern. In the formation of the extended Markov algorithm, the
variable and back reference names are treated as extended Markov
algorithm string variables. Hence the translation of a variable or
back reference name is given as a constant (see the definition of
patterns given previously), the variable names are specified as
extended Markov algorithm string variables representing members of
one of the sets "STR", "BAL STR", and "FIX LN STR" defined in
Appendix 3.1a (see the auxiliary term for the predicate part
"SPECS" in the definition of a pattern), and the lists of variable
names* and their set specifications are passed as arguments to the
function "MATCH_AND_ASSIGN". The evaluation of the function
"MATCH_AND_ASSIGN" results in the following actions:

(a) An attempt is made to match the pattern to the 
object string. 

(b) If a match is found, the values of the variables 
are updated, the value of the string expression 
is computed, the name to which the pattern has 
been applied is updated to its new value, and the 
string "TRUE" is returned. 

(c) If no match is found, the string "FALSE" is re- 
turned. 
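
The three actions can be sketched in Python. This is a rough model
under stated assumptions and not the definition in Appendix 3.3c:
the function name `match_and_assign`, the dictionary `env`, the use
of a Python regular expression with named groups in place of a
translated pattern, and the callable `replacement` (standing in for
the delayed string expression) are all hypothetical.

```python
import re


def match_and_assign(env, name, pattern, replacement):
    """Apply a pattern matching rule to the string named `name`.

    pattern: regex whose named groups play the role of the pattern
             variables (an assumption of this sketch).
    replacement: callable taking the environment and returning the
             value of the string expression; it is called only after
             the variables have been updated.
    """
    subject = env[name]
    match = re.search(pattern, subject)
    if match is None:
        return "FALSE"                      # (c) no match, no action
    env.update(match.groupdict())           # variables are updated
    new_text = replacement(env)             # string expression evaluated
    # The occurrence is replaced and the name gets the new string.
    env[name] = subject[:match.start()] + new_text + subject[match.end():]
    return "TRUE"


env = {"X": "HESSE,KAFKA", "NAME": "", "REVERSE": ""}
print(match_and_assign(env, "X", r"(?P<NAME>.*?),", lambda e: ""))
print(env)
```

Passing the string expression as a callable reflects the point that
it must be evaluated with the new values of the variables, not the
values they had before the match.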



"The list of variable names is given by the auxiliary term 
for the auxiliary predicate part "VAR REFS" generated in the
canonical system for the syntax of SNOBOL/1. This auxiliary
term is also generated in the complete (unabbreviated)
canonical system of the translation of SNOBOL/1 and is used
to specify the translation of SNOBOL/1 as indicated above.
Input Rules and Output Rules

PATTERN:SPECS:VAR REFS<p..p':c:v>
    → INPUT RULE<SYS .READ p..
          (MATCH_AND_ASSIGN(READER*,p',λπ.'Λ','c','(v)','v'))>;

STRING EXP<s..s'> → OUTPUT RULE<SYS .PRINT s..
          (PRINTER* ASSIGN. ((CAT PRINTER*) s'))>;

Example Input and Output Rules:

SYS .READ *X*..(MATCH_AND_ASSIGN(READER*,'X',λπ.'Λ',
                    'X∈STR |','(X,)','X,'))
SYS .PRINT REVERSE..(PRINTER* ASSIGN. ((CAT PRINTER*) REVERSE))

An input rule consists of the string "SYS .READ" followed by a
pattern. An output rule consists of the string "SYS .PRINT"
followed by a string expression.

The input and output of strings on external devices is defined in
the target language by assuming that there are two system
variables "READER*" and "PRINTER*" that contain the initial values
of the input and output strings.* When a string is input into a
program, the value of the system variable "READER*" is changed to
the string computed from the current value by deleting the string
to be read in, and the values of the string variables in the
pattern are updated. The pattern matching and updating of
variables are handled through the function "MATCH_AND_ASSIGN"
described previously. When a string is output from a program, the
value of the system variable "PRINTER*" is updated by appending
the string value of the string expression.

*The initial values of these variables can be added to the
initial environment named A l in the target language evaluator.

Assignment Rules 

STR NAME<n..n'>, STR EXP<s..s'> → ASSIGN RULE
    <n=s..(n' ASSIGN. s')>;

Example Assignment Statement:
REVERSE = X REVERSE..(REVERSE ASSIGN. ((CAT X) REVERSE))

An assignment rule consists of a string name followed 
by an equal sign and a string expression. The execution of 
an assignment rule results in assigning the string value of 
the string expression to the string name. 

Rules 

PAT MATCH RULE<r..r'> | INPUT RULE<r..r'> | OUTPUT RULE<r..r'> |
    ASSIGN RULE<r..r'> → UNLABELED RULE<r..r'>;

UNLABELED RULE<r..r'> → RULE<σr..r'>;

UNLABELED RULE<r..r'>, NAME<n> → RULE<nσr..n: r'>;

Example Rules:

   REVERSE = NAME REVERSE..(REVERSE ASSIGN. ((CAT NAME) REVERSE))
L4 REVERSE = NAME REVERSE..L4: (REVERSE ASSIGN. ((CAT NAME) REVERSE))

A rule must be prefixed by a sequence of blank spaces or a name. A
name prefixing a rule is called a label and is used to identify a
rule when the normal order of evaluation is to be interrupted and
to be continued at the labeled rule.
Statements

NAME<n> → LABEL EXP<n.. .n>;

STR NAME<n> → LABEL EXP<$n..(LOOKUP. ((CAT '.') n))>;

RULE<r..r'>, LABEL EXP<ℓ..ℓ'>,<m..m'>
    → STM<r..r'>,
         <r/(ℓ)..r';(GOTO. ℓ')>,
         <r/S(ℓ)..r' ⇒ (GOTO. ℓ') ELSE ⇒ 'Λ'>,
         <r/F(m)..r' ⇒ 'Λ' ELSE ⇒ (GOTO. m')>,
         <r/S(ℓ)F(m)..r' ⇒ (GOTO. ℓ') ELSE ⇒ (GOTO. m')>,
         <r/F(m)S(ℓ)..r' ⇒ (GOTO. ℓ') ELSE ⇒ (GOTO. m')>;

Example Statement:

L3 REVERSE = "," NAME REVERSE /(L2)..

    L3: (REVERSE ASSIGN. ((CAT ((CAT ',') NAME)) REVERSE));
    (GOTO. .L2)

A label expression in SNOBOL/1 is an expression whose string value
is a label. A label can be referenced directly by giving the name
of a label, or indirectly by giving a string name whose value is a
label and prefixing the string name by a dollar sign.

A statement consists of one of the strings "r", "r/(ℓ)",
"r/S(ℓ)", "r/F(m)", "r/S(ℓ)F(m)", or "r/F(m)S(ℓ)", where r is a
rule and ℓ and m are label expressions. The execution of a
statement of the form "r/(ℓ)" results in executing rule r and then
transferring control to the statement designated by the label
expression ℓ. The execution of a statement of the form "r/S(ℓ)"
results in executing rule r and then transferring control to the
statement designated by ℓ if the rule (presumably a pattern
matching rule or input rule) succeeded in matching the pattern in
the rule to its object string. Similarly, a statement of the form
"r/F(m)" results in transferring control to the statement
designated by m if the execution of rule r failed to match the
pattern in the rule to its object string. Finally, statements of
the form "r/S(ℓ)F(m)" or "r/F(m)S(ℓ)" result in transferring
control to the statement designated by ℓ if the execution of rule
r succeeded in matching its pattern to its object string, and to
the statement designated by m if it failed.
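
The choice of the next statement can be summarized in a small
Python sketch. The function `next_label` and its parameter names
are hypothetical; the strings "TRUE" and "FALSE" are the values
returned by "MATCH_AND_ASSIGN" as described earlier, and a result
of `None` is used here to mean that execution continues with the
next statement in sequence.

```python
def next_label(rule_result, goto=None, on_success=None, on_failure=None):
    """Return the label to transfer to, or None to fall through.

    goto:       label of an unconditional transfer, as in r/(l)
    on_success: label given by S(l), if any
    on_failure: label given by F(m), if any
    """
    if goto is not None:
        return goto                 # r/(l): always transfer
    if rule_result == "TRUE":
        return on_success           # r/S(l): transfer on success
    return on_failure               # r/F(m): transfer on failure


# L2 in the example program: /S(L3)F(L4)
print(next_label("TRUE", on_success="L3", on_failure="L4"))
print(next_label("FALSE", on_success="L3", on_failure="L4"))
```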

Statement Sequences* 



STM<s..s'> → STM SEQ<s..s'>;

STM SEQ<q..q'>, STM<s..s'> → STM SEQ<q↵s..q';s'>;

STM SEQ<q..q'>, STRING<s> → STM SEQ<q↵*s..q'>,<*s↵q..q'>;

Example Statement Sequence:

L4  REVERSE = X REVERSE
    SYS .PRINT REVERSE
..
L4: (REVERSE ASSIGN. ((CAT X) REVERSE));
    (PRINTER* ASSIGN. ((CAT PRINTER*) REVERSE));

A statement sequence consists of a list of statements, each on a
new line. The statements are executed in order unless a statement
explicitly specifies a transfer of control. Arbitrary character
strings prefixed by an asterisk can be inserted among statements.
The character strings provide comments for the programmer and are
not evaluated.

*The symbol "↵" denotes a new line.
SNOBOL/1 Programs*

STM SEQ:STR REFS<q..q':s>, NAME<n>, LIST:BVS:CORR NULL LIST<s:v:Λ>
    → SNOBOL PROGRAM<q END n..LET v = Λ IN (GOTO. 'n'); q'>;



Example Program:

L1      SYS .READ *X*
L2      X *NAME* "," =                    /S(L3)F(L4)
L3      REVERSE = "," NAME REVERSE        /(L2)
L4      REVERSE = X REVERSE
        SYS .PRINT REVERSE
END L1

Translation:

LET X, NAME, REVERSE = 'Λ','Λ','Λ'
IN (GOTO. .L1);
L1: (MATCH_AND_ASSIGN(READER*,'X',λπ.'Λ','X∈STR |','(X,)','X,'));
L2: (MATCH_AND_ASSIGN(X,((CAT 'NAME') ','),λπ.'Λ',
         'NAME∈STR |','(NAME,)'))
         ⇒ (GOTO. .L3) ELSE ⇒ (GOTO. .L4);
L3: (REVERSE ASSIGN. ((CAT ((CAT ',') NAME)) REVERSE));
    (GOTO. .L2);
L4: (REVERSE ASSIGN. ((CAT X) REVERSE));
    (PRINTER* ASSIGN. ((CAT PRINTER*) REVERSE));



*Like the list of variable names, the list of string names used in
a SNOBOL/1 program is generated in the canonical system for syntax
and is used in the canonical system for the translation to form
the list of bound variables for the target language translation of
a program.

The predicate "LIST:BVS:CORR NULL LIST" names a set of ordered
triples, where the first element of each triple is a list of names
(e.g., X, Y, X, ALPHA, Y,), the second element is a name list
containing one occurrence of each name in the first list (e.g.,
X, Y, ALPHA,), and the third element is a list of null strings
with the same number of elements as the second list (e.g., 'Λ',
'Λ', 'Λ'). This predicate is used to set the list of string names
in a program to bound variables, each with the initial value of a
null string.
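
The relationship among the three elements of such a triple can be
sketched in Python. The function name `bound_variables` is
hypothetical, and the sketch assumes that the names in the second
list keep the order of their first occurrence, as the example
suggests.

```python
def bound_variables(names):
    """From a list of string names, build the list of distinct names
    and a matching list of null strings (their initial values)."""
    distinct = list(dict.fromkeys(names))  # keeps first-occurrence order
    nulls = [""] * len(distinct)
    return distinct, nulls


print(bound_variables(["X", "Y", "X", "ALPHA", "Y"]))
```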









A SNOBOL/1 program consists of a statement sequence followed by a
statement of the form "END n", where "END" is a label and "n"
designates the label of some statement in the statement sequence.
The execution of a program begins by initializing the string
values of the string names in the program to null and then
executing the statements in the program beginning with the
statement labeled by "n".

The example program above reads in a string from the input device
and outputs the string computed from the input string by reversing
the order of each substring separated by a comma. For example, if
the string "HESSE, KAFKA, MANN" is on the input device, the string
"MANN, KAFKA, HESSE" is printed on the output device.
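
The control flow of the example program can be mirrored in Python
to make the behavior easier to follow. This is a hand translation
under stated assumptions, not output of the formal system: the
function name `reverse_names` is hypothetical, the whole input is
assumed to be read into X at L1, and the sample input is written
without spaces after the commas so that no assumption about the
treatment of spaces is needed.

```python
def reverse_names(reader):
    """Mirror of the example program, one comment per SNOBOL/1 label."""
    x = reader                      # L1: SYS .READ *X*
    reverse = ""                    # string names start as null
    while True:
        # L2: X *NAME* "," =   (shortest NAME before the first comma)
        name, comma, rest = x.partition(",")
        if not comma:
            break                   # failure: /F(L4)
        x = rest                    # matched part replaced by null
        # L3: REVERSE = "," NAME REVERSE   then /(L2)
        reverse = "," + name + reverse
    # L4: REVERSE = X REVERSE
    reverse = x + reverse
    return reverse                  # SYS .PRINT REVERSE


print(reverse_names("HESSE,KAFKA,MANN"))  # prints: MANN,KAFKA,HESSE
```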



Context-Sensitive Requirements on the Syntax of SNOBOL/1

There are a few context-sensitive requirements on the syntax of
SNOBOL/1:

(a) The variable names in a pattern must each be different.

(b) The back reference names in a pattern must be identical to
    the variable names and different from the string names.

(c) The labels in a program must each be different, and each
    reference to a label in a label expression must refer to a
    name that actually occurs as a label.

These requirements are specified in the canonical system for the
syntax of SNOBOL/1 by specifying with each construct

(a) the lists of names used as string names, variable names, and
    back reference names (productions 3 of Appendix 3.1),

(b) the lists of names used as labels (production 11.3) and names
    used to refer to labels (production 12.1),

and specifying

(a) that the list "r_v" of variable names in a pattern must
    contain names each of which is different (the premise
    "DIFF NAME LIST<r_v>" in production 6.8),

(b) that the list "r_b" of back reference names in a pattern must
    be contained within the list "r_v" of variable names and that
    the list "r_s" of string names in a pattern must be disjoint
    from the list "r_v" of variable names (the premise
    "L1:L2:INTERSEC<r_b:r_v:r_b>,<r_s:r_v:Λ>" in production 6.8),
    and

(c) that the list of labels in a program must contain names each
    of which is different and that each label reference must be
    contained in the list of labels (production 14).

The additional predicates "DIFF NAME LIST" and "L1:L2:INTERSEC"
are defined at the end of Appendix 3.1.
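
The three checks can be sketched with Python sets. This is an
informal restatement and not the canonical system of Appendix 3.1:
the function names `check_pattern` and `check_labels` and the
wording of the messages are hypothetical.

```python
def check_pattern(variable_names, back_reference_names, string_names):
    """Return a list of violated requirements for one pattern."""
    problems = []
    # (a) the variable names must each be different
    if len(set(variable_names)) != len(variable_names):
        problems.append("variable names are not all different")
    # (b) back reference names must be contained in the variable names
    if not set(back_reference_names) <= set(variable_names):
        problems.append("back reference without a matching variable name")
    # (b) string names must be disjoint from the variable names
    if set(string_names) & set(variable_names):
        problems.append("name used both as string name and variable name")
    return problems


def check_labels(labels, label_references):
    """Return a list of violated requirements for the labels of a program."""
    problems = []
    # (c) the labels must each be different
    if len(set(labels)) != len(labels):
        problems.append("labels are not all different")
    # (c) every label reference must name an existing label
    if not set(label_references) <= set(labels):
        problems.append("reference to an undefined label")
    return problems


print(check_pattern(["X"], ["X"], ["Y"]))
print(check_labels(["L1", "L2", "L3", "L4"], ["L3", "L4", "L2", "L1"]))
```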



This chapter has attempted to describe in detail the syntax and
semantics of SNOBOL/1. It is intended that a reader, having
digested this chapter, would have sufficient knowledge of SNOBOL/1
and its formal definition to be able to use the compact, formal
definition to answer further questions concerning the syntactic
legality or meaning of a given SNOBOL/1 construct. It is hoped
that this chapter has served that objective.
CHAPTER V

A SPECIFICATION OF THE SYNTAX AND SEMANTICS 
OF ALGOL/60 



This chapter exercises the formal system presented in this
dissertation to specify the syntax and semantics of ALGOL/60, as
defined in the official ALGOL/60 report edited by Peter Naur.²⁸
The intent of this chapter is not only to explicate the formal
specification of ALGOL/60, but also to relate the techniques used
in the formal specification of ALGOL/60 to other languages and to
compare the formal system presented here to other methods of
language specification. A knowledge of ALGOL/60 is assumed in this
chapter.

It is surprising that, although ALGOL/60 is the official
publication language of the Association for Computing Machinery
and is accordingly widely publicized, the author knows of no
implementation of the complete language. Probably the most
important factor in this circumstance is the complexity of
ALGOL/60. Indeed, in writing this chapter I frequently found
myself in the difficult situation of first attempting to
understand ALGOL/60 and then attempting to characterize the
language with the formal system. There are many interrelated
program constructions and a complicated variety of restrictions on
programs that make the language difficult to understand and
define. Nevertheless, as an example of the formal system applied
to a somewhat complex computer language, a specification of the
syntax and semantics of ALGOL/60 is presented in Appendix 4.*

Previous Work by Peter Landin: 

In his paper²¹ "A Correspondence Between ALGOL/60 and Church's
Lambda Notation," Peter Landin described the semantics of ALGOL/60
in terms of a modified form of Church's λ-calculus, called
"imperative applicative expressions" or "IAEs". The target
language presented here is similar to Landin's imperative
applicative expressions in that the λ-calculus was augmented to
directly handle the assignment and transfer of control features of
ALGOL/60. The target language differs from imperative applicative
expressions in that (a) the mechanism to handle transfer of
control here is different from that of Landin, and (b) Landin's
(SECD) machine to evaluate imperative applicative expressions is
specified by a λ-calculus expression, whereas the machine to
evaluate target language expressions here is specified by an
extended Markov algorithm.

The specification of the semantics of ALGOL/60 given here is
heavily based on Landin's definition. On the other hand, the
dissertation here not only includes a specification of the
semantics of ALGOL/60, but also a specification of syntax and a
definition of the primitive functions used in specifying the
semantics. The primitive functions used to specify the semantics
of ALGOL/60 are defined only by example in Landin's paper.

*The specification of character spacing and of the use of
exponents in numbers is not included.

The Syntax of ALGOL/60

The canonical system specifying the syntax of ALGOL/60 is given in
Appendix 4.1. The first term in each specified term tuple
describes some string in ALGOL/60. If the auxiliary predicate
parts and terms are deleted from this specification, Appendix 4.1
can be viewed as a partial (context-free) specification of the
syntax. A context-free specification of ALGOL/60's syntax exists
in the ALGOL/60 report, and the specification of Appendix 4.1
closely parallels the specification in that report. Although it
does not completely specify the syntax of the language, the
context-free specification of ALGOL/60 is fairly straightforward,
and the presentation of the canonical system of ALGOL/60 will
therefore focus on the context-sensitive requirements.

Context-Sensitive Requirements on the Syntax of ALGOL/60 

There are myriad context-sensitive requirements on the 
syntax of ALGOL/60. Among these requirements are 

(a) The type of each identifier in a program must be 
declared. 

(b) An identifier cannot be used in conflicting contexts
in the same block. There are many variants
of this requirement. For example, an identifier
used as a real variable in a block cannot be used
as a boolean variable, an array identifier, a
procedure identifier, or a switch identifier.

(c) Any use of an array identifier must occur with a 
subscript list of the same dimension as that of 
the bound pair list in the array declaration. 

(d) The bound pair list in an array declaration can 
depend only on variables that are non-local to the 
block in which the array declaration is given. 

(e) All statement labels in a block must be different. 

(f) The uses of actual parameters in a function desig- 
nator must be compatible with the uses of the cor- 
responding formal parameters in the procedure 
declaration. There are many, many variants of 
this requirement. For example, an actual parameter 
that is declared to be a real variable cannot cor- 
respond to a formal parameter that is used as a 
boolean variable, an actual parameter that is a 
procedure identifier must correspond to a formal 
parameter that is used with arguments that are 
consistent with the procedure declaration, and an 
actual parameter that is an arithmetic expression 
cannot correspond to a formal parameter that is 
called by name and assigned a value in the procedure 
declaration. 

The context-sensitive requirements on the syntax of 
ALGOL/60 occur in many other computer languages besides 
ALGOL/60. The restriction (a) that the type of each identifier 
must be declared occurs in many computer languages. For 
example, in PL/1 each occurrence of an identifier used to 
name an object must be declared, either explicitly, contextually, 
or implicitly. An explicit declaration of an identifier is 
given through a DECLARE statement, whereby an identifier is 
given an attribute restricting the use of the identifier to 
statements operating on certain classes of data, e.g., fixed 
point numbers, character strings, or files. A contextual 
declaration of an identifier is given when an identifier 
occurs in a context where only one class of data objects can 
occur, e.g., in the statement "GET FILE (X) DATA" the identifier 
"X" is contextually declared as a member of the class 
file in that only a file name can occur after the string "GET 
FILE" in a GET statement. An implicit declaration of an 
identifier is given when an identifier is associated with 
other declared identifiers (e.g., in the statement 
"T = A * B", if "A" and "B" are declared as fixed point numbers, 
the identifier "T" may be implicitly declared as a fixed 
point number). Programs not specifying a unique declaration 
for each identifier are illegal. 

The restriction (b) that identifiers cannot be used in 
conflicting contexts occurs in almost every language where 
different classes of data objects are distinguished. For example, 
although PL/1 allows some identifiers to be used in different 
contexts, many contexts of declared identifiers are considered 
illegal, e.g., if "X" is explicitly declared as a bit string, 
the statement "GET FILE (X) DATA" is illegal since the GET 
statement contextually declares "X" as a file. 

The restriction (e) that all statement labels in a block 
must be different occurs in almost every language allowing 
statements to be labeled and control to be passed to a labeled 
statement. The labels must be different in order for the 
destination of the transfer of control to be unique. For 
example, in FORTRAN IV no two statements may be labeled with 
the same statement number. 

The restriction (f) that corresponding actual and formal 
parameters must be compatible likewise occurs in many 
languages and can become complicated, especially in languages 
allowing nested procedure definitions and applications like 
ALGOL/60. 

The author knows of only one major computer language 
where a complete formal specification of its syntax has been 
given. In particular, the simulation language GPSS has been 
specified completely by Donovan [3], using canonic systems. 
Otherwise, the syntax of many computer languages has been 
either specified informally or partially formalized, 
usually with a context-free grammar. 

Before discussing the specification of the context- 
sensitive requirements on the syntax of ALGOL/60, the reader 
is reminded that the auxiliary predicate parts and terms in 
a production generally specify the lists of identifiers, 
labels, variables, etc., that are used within the source 
language string specified by the first term in the production. 
These lists will be referred to repeatedly in the productions 
to follow. 

Specification of the Requirement that the Type of Each Variable 
Must be Declared: 

Consider the (abbreviated) production* from the canonical 
system of the syntax of ALGOL/60: 

ID<i> → REAL VAR:R VARS<i:i,>; 

If "i" designates a string that is an identifier, the term 
tuple "<i:i,>" designates a pair where the first element is 
an identifier used as a real variable, and the second element 
designates the addition of the identifier to the list of 
identifiers used as real variables in a program. Consider 
also the production 

IDLIST<ℓ> → TYPE DEC:DEC R VARS<REAL ℓ:ℓ,>; 

If "ℓ" designates a string that is a list of identifiers, 
the term tuple "<REAL ℓ:ℓ,>" designates a pair where the first 
element is an ALGOL/60 declaration of a list of identifiers 
as real variables, and the second element designates the 
addition of the list of identifiers to the list of identifiers 
declared as real variables. 

*The productions given in the text will generally be only 
portions of the corresponding productions given in Appendix 4. 
Portions of productions are given in the text to illuminate 
better the particular construction under discussion. An 
explication of the complete canonical system for ALGOL/60 will 
be given later in the chapter. 

Next consider the production 

STM SEQ:R VARS<s:v_r>, DEC SEQ:DEC R VARS<d:v_rd>, 
L1:L2:REL COMP<v_r:v_rd:v_r'> 
→ BLOCK:R VARS<BEGIN d;s END:v_r'>; 

Here, if 

(a) "s" is a statement sequence with a list "v_r" of 
identifiers used as real variables, 

(b) "d" is a declaration sequence with a list "v_rd" of 
identifiers declared as real variables, and 

(c) "v_r'" is the list computed from "v_r" and "v_rd" by 
forming their relative complement (i.e., v_r - v_rd), 

then 

(d) "BEGIN d;s END" is a block with a list "v_r'" of 
identifiers that are used as real variables in the 
block but not declared within the block. 

Finally, consider the production 

PROGRAM STR:R VARS<p:Λ> → ALGOL PROGRAM<p>; 

Here, if (a) "p" is a string that is in the form of a program 
and (b) the list "R VARS" of identifiers that are used in the 
program as real variables but are not yet declared is given 
as null, then the string "p" is specified as a bona fide legal 
ALGOL program. 

In this manner (a) each identifier in a program used as 
a real variable is added to the list of used real variables, 
(b) each identifier declared as a real variable is added to 
the list of declared real variables, (c) each identifier de- 
clared in a block as a real variable is removed from the list 
of identifiers used as real variables, and (d) a string is 
specified as a legal program only if the list of used (but 
as yet undeclared) real variables is given as null. 
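
The bookkeeping in steps (a) through (d) can be imitated outside the canonical system. The following Python sketch is only an illustration under assumed names (`Block`, `undeclared_reals`, and `is_legal_program` are hypothetical and do not appear in the appendices): each block reports the relative complement of its used and declared real variables, and a program is accepted only when the resulting list is empty.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A block reduced to the two lists the productions track."""
    declared: set = field(default_factory=set)   # identifiers declared REAL here
    used: set = field(default_factory=set)       # identifiers used as reals here
    inner: list = field(default_factory=list)    # nested blocks


def undeclared_reals(block):
    """Used-but-not-declared reals of a block (the list written v_r - v_rd)."""
    used = set(block.used)
    for nested in block.inner:
        # Whatever a nested block leaves undeclared counts as used here.
        used |= undeclared_reals(nested)
    return used - block.declared


def is_legal_program(block):
    """A program is legal only if no used real variable remains undeclared."""
    return not undeclared_reals(block)


inner = Block(declared={"Y"}, used={"X", "Y"})
good = Block(declared={"X"}, used={"X"}, inner=[inner])
bad = Block(declared=set(), used={"X"}, inner=[inner])
```

Here `good` is accepted because the outer block declares the "X" that the inner block leaves undeclared, while `bad` is rejected.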



Specification That Identifiers Cannot be Used in Conflicting 
Contexts : 

Consider the following production 

STM SEQ:R VARS:B VARS<s:v_r:v_b>, DEC SEQ<d>, 
DISJ ENTRY LISTS<(v_r)(v_b)> → BLOCK<BEGIN d;s END>; 

where the predicate "DISJ ENTRY LISTS" specifies a set 
consisting of one or more identifier lists, each enclosed in 
parentheses, such that each list is disjoint from the others. 
If "v_r" and "v_b" specify the lists of identifiers used 
respectively as real variables and boolean variables in a 
statement sequence, the premise "DISJ ENTRY LISTS<(v_r)(v_b)>" 
ensures that the string "BEGIN d;s END" is a legal block 
only if the lists "v_r" and "v_b" are disjoint, i.e., no 
identifier is used in conflicting contexts. 
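
As a rough illustration of what the premise "DISJ ENTRY LISTS" demands, the Python sketch below checks that a collection of identifier lists is pairwise disjoint. The function name `disj_entry_lists` is an assumption made for this example; the predicate itself is defined by productions in the appendix, not by this code.

```python
from itertools import combinations


def disj_entry_lists(*lists):
    """True when no identifier occurs in more than one of the given lists."""
    return all(set(a).isdisjoint(b) for a, b in combinations(lists, 2))


# "X" used both as a real and as a boolean variable is a conflict.
reals_ok, bools_ok = ["X", "Y"], ["B"]
reals_bad, bools_bad = ["X"], ["X", "B"]
```

With these lists, `disj_entry_lists(reals_ok, bools_ok)` holds and `disj_entry_lists(reals_bad, bools_bad)` does not.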



Specification That Actual and Formal Parameters Must Be 
Compatible : 

The requirements on the uses of actual and formal parameters 
of ALGOL/60 procedures are complicated. For example, 
let "P(X,A)" be a declared procedure with two formal parameters 
"X" and "A", where in the declaration of "P", "X" is used as 
a real variable and "A" is used as an integer array of 
dimension three. The function designator "P(3.1,Q)", where "Q" 
is a declared integer array of dimension three, would constitute 
a legal activation of the procedure "P", whereas the 
function designator "P(TRUE,Q)" would not be legal since the 
type "REAL" of "X" and the type "BOOLEAN" of "TRUE" are not 
compatible. 

To specify the context-sensitive requirements on procedures, 
a number of additional predicates are defined. For 
simplicity, in the discussion to follow I will assume that 
ALGOL/60 has only three data types: real variables, boolean 
variables, and integer arrays. Consider the following 
productions: 

DIMM<1>; 

DIMM<m> → DIMM<m1>; 

SPEC<REAL>,<BOOLEAN>; 

DIMM<m> → SPEC<INTEGER ARRAY(m)>; 

SPEC<s> → SPEC LIST<s>; 

SPEC<s>, SPEC LIST<ℓ> → SPEC LIST<ℓ,s>; 

Here the predicate "SPEC" specifies a set comprising the 
strings {REAL BOOLEAN INTEGER ARRAY(1) INTEGER ARRAY(11) 
INTEGER ARRAY(111) ...}, where each string specifies the use 
of some formal parameter in a procedure declaration. The 
predicate "SPEC LIST" specifies a set where each member is 
a string of parameter specifications, each separated by a 
comma. 
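
The sets specified by "DIMM", "SPEC", and "SPEC LIST" can be enumerated mechanically. The Python sketch below is an illustration only: the function names and the cut-off `max_dim` are assumptions introduced here, since the productions themselves generate dimensions without bound. Dimensions are written in unary, as in the productions.

```python
def dimm(max_dim):
    """Unary dimension strings "1", "11", "111", ... up to max_dim."""
    return ["1" * n for n in range(1, max_dim + 1)]


def spec(max_dim):
    """Members of SPEC for the three assumed data types."""
    arrays = [f"INTEGER ARRAY({m})" for m in dimm(max_dim)]
    return ["REAL", "BOOLEAN"] + arrays


def is_spec_list(text, max_dim=10):
    """True when text is a comma-separated string of SPEC members."""
    allowed = set(spec(max_dim))
    return all(part.strip() in allowed for part in text.split(","))
```

For instance, `spec(3)` lists the five strings shown in the braces above, and `is_spec_list("REAL, INTEGER ARRAY(111)")` holds.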

For example, if "P" is a procedure declared as above, 
the specification list for the formal parameters of "P" would 
be "REAL, INTEGER ARRAY(111)". Similarly, if "P(3.1,Q)" and 
"P(TRUE,Q)" are function designators where "Q" is declared 
as an integer array of dimension three, the specification 
list for "P(3.1,Q)" would be "ARITH EXP, INTEGER ARRAY(111)" 
and the specification list for "P(TRUE,Q)" would be "BOOL 
EXP, INTEGER ARRAY(111)". In the specification of the syntax 
of ALGOL/60, a predicate "SPEC MATCH" is defined. The ordered 
pair "<ARITH EXP, INTEGER ARRAY(111):REAL, INTEGER ARRAY(111)>" 
is a member of this predicate, and thus, by using this predicate 
as a premise in the canonical system for ALGOL/60, the 
function designator "P(3.1,Q)" is allowed as a compatible 
function designator with the above indicated declaration of 
"P". On the other hand, the ordered pair "<BOOL EXP, INTEGER 
ARRAY(111):REAL, INTEGER ARRAY(111)>" is not a member of this 
predicate, and thus the function designator "P(TRUE,Q)" is 
not allowed as a compatible function designator for "P". 
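
The role of "SPEC MATCH" can be sketched as a position-by-position comparison of two specification lists. In the Python illustration below, the table `COMPATIBLE` is an assumption that covers only the two cases named in the text (an arithmetic expression may meet a REAL formal parameter, a boolean expression a BOOLEAN one); the actual predicate in the appendix covers many more data types.

```python
# Assumed compatibility pairs (actual use, formal use); illustrative only.
COMPATIBLE = {
    ("ARITH EXP", "REAL"),
    ("BOOL EXP", "BOOLEAN"),
}


def spec_match(actual, formal):
    """True when each actual specification is compatible with its formal one."""
    if len(actual) != len(formal):
        return False
    return all(a == f or (a, f) in COMPATIBLE for a, f in zip(actual, formal))


formal_p = ["REAL", "INTEGER ARRAY(111)"]
call_ok = ["ARITH EXP", "INTEGER ARRAY(111)"]    # P(3.1,Q)
call_bad = ["BOOL EXP", "INTEGER ARRAY(111)"]    # P(TRUE,Q)
```

Under these assumptions `spec_match(call_ok, formal_p)` holds and `spec_match(call_bad, formal_p)` does not, mirroring the two ordered pairs discussed above.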

Since the number of data types in ALGOL/60 is much greater 
than the number of types assumed in the examples just given, 
the actual specification of the context-sensitive requirements 
is much more complicated than indicated in the previous para- 
graphs. A detailed discussion of the complete canonical 
system specification of the context-sensitive requirements 
on ALGOL/60 procedures is given at the end of this chapter. 

The Semantics of ALGOL/60 

It seems that much less work in computer science has been 
directed to formalizing semantics than in formalizing syntax. 
While many methods for characterizing (at least in part) the 
syntax of computer languages have been successfully developed, 
few methods for characterizing semantics have reached a 
development where entire languages have been characterized. 

An application of the λ-calculus has been used by Peter Landin [21] 
and John Wozencraft [25] to characterize respectively the semantics 
of ALGOL/60 and the classroom language PAL. The characterization 
of semantics given in this dissertation is in 
part based on these efforts. 

A quite different approach to characterizing semantics 
has been taken by the IBM Vienna laboratory, which has under- 
taken the formidable task of characterizing the semantics of 
PL/1. This group has used portions of LISP, the predicate 
calculus, set theory, and other constructs of their own inven- 
tion to characterize the semantics of PL/1. Their work has 
been described in several lengthy IBM technical reports. A 
judgment of the utility of their approach awaits a more 
digestible presentation of the formal system and the tech- 
niques used within the formal system. 

The specification of the semantics of ALGOL/60 in terms 
of the target language presented here is given in Appendix 
4.2. Much of the semantics of ALGOL/60, e.g., arithmetic 
expressions, boolean expressions, designational expressions, 
conditional statements, and statement sequences, is 
straightforwardly defined in the target language and in part has 
been discussed in previous chapters. I will therefore focus 
the discussion of this chapter on some constructs in ALGOL/60 
whose semantics are not quite as obviously expressed in terms 
of the target language. 

The table on the following pages lists several example 
ALGOL/60 expressions and their translations into the target 
language. In the discussion to follow, the reader may find it 
helpful to refer to these examples. 

[Table: Example ALGOL/60 Expressions and Their Translations 
into the Target Language. Columns: Syntactic Type; ALGOL/60 
Expression; Translation into the Target Language.] 

Primitive Functions Used to Define the Semantics of ALGOL/60: 

Appendix 4.3 defines the primitive functions used in 
defining the semantics of ALGOL/60. Appendices 4.3a and 4.3b 
define miscellaneous primitives, like the function "NEQ" for 
negating a boolean value, the function "HD" for computing the 
head of a list, and the function "ABS" for computing the 
absolute value of a number. Real numbers in ALGOL/60 are 
represented in the target language by their fractional 
equivalent. A fraction in the target language is a string of the 
form "xDy", where x and y represent respectively the numerator 
and denominator of the fraction. For example, the real number 
"1.5" in ALGOL/60 is translated into the target-language 
string "3D2" denoting the fraction three-halves (3 Divided by 
2). Appendix 4.3c defines the primitives "TRANS_INT" and 
"TRANS_FRAC" for converting real numbers to their fractional 
representation and the primitives "CONV_TO_REAL" and 
"CONV_TO_INT" for converting integer numbers to real numbers and 
real numbers to integer numbers. Appendices 4.3d and 4.3e 
define the arithmetic and boolean primitives. 
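
The fractional representation "xDy" can be illustrated with exact rational arithmetic. The Python sketch below is not the definition of "TRANS_FRAC" from the appendix; the helper names `to_target_fraction` and `from_target_fraction` are assumptions, and the sketch leans on the standard `fractions` module to reduce the fraction to lowest terms.

```python
from fractions import Fraction


def to_target_fraction(real_literal):
    """Render a decimal literal such as "1.5" as a string of the form xDy."""
    value = Fraction(real_literal)          # exact: "1.5" becomes 3/2
    return f"{value.numerator}D{value.denominator}"


def from_target_fraction(text):
    """Read a string of the form xDy back into an exact fraction."""
    numerator, denominator = text.split("D")
    return Fraction(int(numerator), int(denominator))
```

Applied to the example in the text, `to_target_fraction("1.5")` gives "3D2".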

Appendices 4.3f and 4.3g define the primitives used in 
defining the semantics of for statements and arrays and will 
be discussed later in the text. 

Primitive functions similar to those given for ALGOL/60 
can be used to define the semantics of many languages used 
for numerical processes. For example, in FORTRAN IV, the 
arithmetic and boolean primitives almost exactly parallel 
those for ALGOL/60. Although FORTRAN IV allows the user to 
(a) specify one of two precisions for real number arithmetic 
and (b) specify arithmetic for complex numbers, these facilities 
can be readily specified in the target language by (a) defin- 
ing a primitive that converts target language fractions to the 
desired precision as real numbers and (b) defining the arith- 
metic operators for complex numbers in terms of those given 
for real numbers. Similarly, the FORTRAN IV facilities for 
arrays and DO statements closely parallel the ALGOL/60 facili- 
ties for arrays and for statements. 

Assignment of Values to Variables and Procedures: 

Consider the following ALGOL/60 assignment statements: 

A := X 
F := X 
A := F := X 

where "X" is an integer variable, "A" is a real variable, and 
"F" is a real procedure identifier. The corresponding target 
language expressions for these statements are: 

LET π = (CONV_TO_REAL X) IN LET α = A IN (α ASSIGN, π) 

LET π = (CONV_TO_REAL X) IN (F# ASSIGN, π) 

LET π = (CONV_TO_REAL X) IN (F# ASSIGN, π); 
LET α = A IN (α ASSIGN, π) 

The expression on the right side of an assignment statement 
must be evaluated only once. Therefore, the translation 
of the right-hand expression is evaluated once and is linked 
with the dummy variable "π", and the value of "π" is used in 
each target language assignment expression. The primitive 
"CONV_TO_REAL" is applied to the value linked with "π" before 
the assignment, so that the value of "π" is a real number. 
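
The point that the right-hand expression is evaluated only once, however many variables appear on the left, can be shown with a small Python sketch. The class `Cell` and the function `assign_all` are hypothetical stand-ins for target language variables and for the chain of assignment expressions; `float` plays the part of the conversion to a real number.

```python
class Cell:
    """A stand-in for an assignable target language variable."""
    def __init__(self, value=None):
        self.value = value


def assign_all(targets, rhs):
    """Evaluate rhs once, convert it, and assign the result to every target."""
    pi = float(rhs())            # evaluated exactly once, then converted
    for alpha in targets:
        alpha.value = pi
    return pi


calls = []

def x():
    """The right-hand expression; records each evaluation."""
    calls.append("X")
    return 3


A, F_mark = Cell(), Cell()       # F_mark stands for the identifier F#
assign_all([F_mark, A], x)       # corresponds to  A := F := X
```

After the call, both cells hold 3.0 and `calls` records a single evaluation of the right-hand side.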

Assignments in the target language can only be made to target 
language variables. The ALGOL/60 variables in the left side of the 
assignment statement are linked with the dummy target language variable 
"α" to handle the case where the ALGOL/60 variable is a formal 
parameter called by name and the ALGOL/60 variable must be 
translated into a target language expression that is not a 
variable. (This point will be discussed shortly.) By linking 
the dummy variable "α" with the translation of the expression 
representing the ALGOL/60 variable, an assignment to "α" will 
also result in an assignment to the corresponding ALGOL/60 
variable. 

The assignment of a value to a procedure in a procedure 
declaration is handled by affixing the mark "#" to the proce- 
dure identifier and assigning the value of the right-hand 
expression to this newly formed identifier. The "#" is affixed 
to the identifier to avoid conflicts with the use of the 
procedure identifier in a recursive call to the procedure. In 
the translation of the entire procedure declaration, the 
translation of the last statement in the declaration is 
followed by the statement "F#", where "F" is the procedure 
identifier. Thus the evaluation of the procedure will return 
the value currently assigned to the procedure identifier. 

Parameters Called by Name and Called by Value: 

Consider the following ALGOL/60 procedure declaration: 

PROCEDURE F(X,Y); VALUE Y; 
BEGIN 
   Y := Y+Y; 
   X := Y*Y; 
END 

In this procedure declaration the formal parameter "X" is 
called by name and the formal parameter "Y" is called by 
value. If "A" and "B" are real numbers whose current values 
are "1" and "2", the evaluation of the procedure statement 

F(A,B); 

results in changing the value of "A" to "16" while leaving the 
value of "B" unchanged. 

Next consider the following target language translations 
of the procedure declaration given above and the procedure 
statement "F(A,B)": 

LET F(X,Y) = LET Y = (UNSHARE (Y 'Λ')) 
             IN LET π = (CONV_TO_REAL (+(Y,Y))) 
                IN LET α = Y IN (α ASSIGN, π); 
                LET π = (CONV_TO_REAL (*(Y,Y))) 
                IN LET α = (X 'Λ') IN (α ASSIGN, π) 

F(λπ.A, λπ.B) 

Here, the translations of the actual parameters "A" and "B" 
are given as functions mapping the dummy variable "π" into 
the variables "A" and "B". In the evaluation of the procedure 
statement "F(A,B)", the function "λπ.B" will be applied 
to the null string (causing the evaluation of "B") and the 
function "UNSHARE" (Appendix 4.3a) will be applied to this 
value (causing the formation of a new cell in the store for 
the value of "B"). Thus subsequent assignments to the formal 
parameter "Y" will not result in changing the value of "B". 
On the other hand, the function "UNSHARE" is not applied to 
"X", and the assignment of a value to "X" will result in 
changing the value of the corresponding actual parameter "A". 
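
The contrast between the two parameter mechanisms can be reproduced with explicit cells and parameterless functions. The Python sketch below is an illustration under assumed names: `Cell` models a storage cell, `unshare` imitates the effect described for "UNSHARE" (copying a value into a new cell), and each actual parameter is passed as a function that returns its cell when applied.

```python
class Cell:
    """A storage cell holding the current value of a variable."""
    def __init__(self, value):
        self.value = value


def unshare(cell):
    """Form a new cell with the same value, so later updates are not shared."""
    return Cell(cell.value)


def F(x, y):
    """PROCEDURE F(X,Y); VALUE Y; with both actuals passed as functions."""
    y_local = unshare(y())                        # called by value: private copy
    y_local.value = y_local.value + y_local.value     # Y := Y+Y
    x().value = y_local.value * y_local.value         # X := Y*Y, reaches the actual


A, B = Cell(1), Cell(2)
F(lambda: A, lambda: B)
```

After the call, the cell for "A" holds 16 while the cell for "B" still holds 2.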



Lists in ALGOL/60: 

In defining the semantics of ALGOL/60, it will be convenient 
to define primitive functions operating on lists of 
strings. I will use the notation 

s1 + s2 + ... + sn 

where the si, 1≤i≤n, are strings, to denote a list. If 
x1, x2, ..., xn are expressions whose values are the strings 
s1, s2, ..., sn, the expression 

(1) ((CAT ... ((CAT ((CAT ((CAT x1) '+')) x2)) '+')) ... xn) 

will result in forming the list 

s1 + s2 + ... + sn 

The concatenation of expressions to form lists will occur 
frequently in the formal definition of ALGOL/60. For convenience, 
I will generally omit the explicit specification of 
the concatenation of the component expressions of a list and 
write list expressions of the form (1) in the alternate notation 

[x1 + x2 + ... + xn] 

Arrays and Switches: 

An array in ALGOL/60 is treated in the target language 
as an indexed linear list, where the number of elements in 
the list equals the number of elements in the array. For 
example, an array with a bound pair list 

[1:2,1:3] 

is translated into the string 

(1+1,Λ) + (1+2,Λ) + (1+3,Λ) + (2+1,Λ) + (2+2,Λ) + (2+3,Λ) 

where the symbol "Λ" specifies an initial null value for each 
element of the array. The translation of arrays into lists is 
handled through the function "MAKE_LIST" (Appendix 4.3g), which 
converts the bound pair list of the array into a linear list 
of array elements, each with an initial null value. An element 
of an array is obtained through the function "GET_EL" 
(Appendix 4.3g), which, given a subscript list and an array 
identifier, obtains the appropriate array element. The 
elements of an array are updated with new values through the 
function "RESET_LIST", which resets the value of one of the 
array elements in the array list. 
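
The treatment of an array as an indexed linear list can be sketched in a few lines of Python. The function names below echo "MAKE_LIST", "GET_EL", and "RESET_LIST", but their signatures and the use of `None` for the null value are assumptions made for this illustration, not the definitions given in the appendix.

```python
from itertools import product

NULL = None   # stands for the initial null value of each element


def make_list(bounds):
    """Turn a bound pair list such as [(1, 2), (1, 3)] into an indexed list."""
    ranges = [range(low, high + 1) for low, high in bounds]
    return [(index, NULL) for index in product(*ranges)]


def get_el(subscripts, array_list):
    """Obtain the element stored under the given subscript list."""
    for index, value in array_list:
        if index == tuple(subscripts):
            return value
    raise IndexError("subscripts outside the bound pair list")


def reset_list(subscripts, new_value, array_list):
    """Return the array list with one element reset to a new value."""
    target = tuple(subscripts)
    return [(index, new_value if index == target else value)
            for index, value in array_list]


a = make_list([(1, 2), (1, 3)])       # bound pair list [1:2,1:3]
a2 = reset_list((2, 1), 7, a)
```

The indices of `a` appear in the same order as in the string above, from (1,1) to (2,3), each paired with the null value.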

Switches are also treated as linear lists. For example, 
a switch with a switch list "L,M,N" is translated into the 
target language string "[(1,λπ. .L) + (2,λπ. .M) + (3,λπ. .N)]". The 
elements of the target language list are given as dummy 
variable functions so that an element of a switch list is 
not evaluated unless the element is selected by a designational 
expression. The translation of switches into lists 
is handled through the primitive function "INDEX_LIST" 
(Appendix 4.3g), which forms an indexed list of switch elements. 
An element of a switch list is obtained by applying the function 
"GET_EL" to the switch list and then applying the selected 
element to the null string. This application results in 
forming the proper label-closure for the label. 
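
The reason for wrapping each switch element in a dummy variable function is that only the selected element should ever be evaluated. A minimal Python sketch, with the hypothetical helpers `index_list` and `select` standing in for "INDEX_LIST" and for the combination of "GET_EL" with application to the null string:

```python
def index_list(elements):
    """Pair each delayed switch element with its position, starting at 1."""
    return [(i, delayed) for i, delayed in enumerate(elements, start=1)]


def select(switch_list, n):
    """Pick element n and only then evaluate it."""
    for i, delayed in switch_list:
        if i == n:
            return delayed()
    raise IndexError("switch subscript out of range")


evaluated = []

def label(name):
    """A delayed designational expression that records its evaluation."""
    def delayed():
        evaluated.append(name)
        return name
    return delayed


switch = index_list([label("L"), label("M"), label("N")])
chosen = select(switch, 2)
```

Selecting the second element evaluates "M" alone; "L" and "N" are never touched.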

Own Variables: 

Consider the following outlined ALGOL/60 program: 

BEGIN 
   REAL X,Y,Z; 
   PROCEDURE F(A); BEGIN OWN X; ... END; 
   ... 
END 

and its target language translation 

LET X#1 = 'Λ' 
IN LET REC X,Y,Z,F(A) = 'Λ','Λ','Λ',LET X = X#1 IN ... 
   IN ... 

The variable "X" in the ALGOL/60 procedure "F" is an own 
variable, and hence on successive calls to the procedure "F" 
the value of "X" is not re-initialized to a null value but 
maintains the value last assigned to "X" on the previous call. 
In the target language translation of the program, a new 
global identifier "X#1" is created, and on each call to "F" 
the value of "X" is set to the value of "X#1". In this manner 
an assignment to the value of "X" will also result in an 
assignment to "X#1". Since "X#1" is global to the entire 
target language expression, "X#1" will maintain the value 
last assigned to "X", and subsequent calls to "F" will result 
in resetting "X" to its last assigned value. 

The mark "#" and positive integer are affixed to the 
global own identifiers so that these identifiers will not 
conflict with other identifiers in the target language 
expression. 
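
The coupling of an own variable with a global identifier can be imitated with a shared cell. In the Python sketch below, `X_1` is a hypothetical stand-in for the global identifier "X#1", and the body of `F` (accumulating its argument) is invented purely to make the retained value visible.

```python
class Cell:
    """A storage cell; None models the initial null value."""
    def __init__(self, value=None):
        self.value = value


X_1 = Cell()          # global cell created once, outside the procedure


def F(a):
    x = X_1           # on each call the local name shares the global cell
    x.value = a if x.value is None else x.value + a
    return x.value


first = F(1)
second = F(2)
```

The second call starts from the value left by the first, so `second` is 3 rather than 2.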

Own arrays are treated similarly to own variables in 
that the own array identifiers are coupled with corresponding 
global identifiers. The global array identifiers are initialized 
with null values. Upon each entry to a block with 
an own array, 

(a) the value of the global array identifier is updated 
to the value computed from the current value of the 
global identifier by (1) retaining the values of 
the array elements whose indices, as specified by 
the current value of the bound pair list, occur in 
the array list for the global identifier, and (2) 
setting to null the values of the array elements 
whose indices do not occur in the array list for 
the global identifier, and 

(b) the value of the own array identifier is coupled with 
the value of the corresponding global array identifier. 

Thus, upon the first entry to the block, each element of the 
own array will be given as null. Since updating the value 
of the local own array identifier will also result in up- 
dating the value of the corresponding global array identifier, 
subsequent entry to the block will result in resetting the 
values of the previously given elements of the own array 
identifier to their previous values and setting the value of 
each array element not included in the previous bound pair 
list to null. 
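
The update performed on each block entry can be sketched as follows. The function `enter_block` is a hypothetical name, `None` models the null value, and the array list is represented as a list of (index, value) pairs; none of these choices is taken from the appendix.

```python
from itertools import product


def enter_block(global_list, bounds):
    """Recompute the global array list for the current bound pair list.

    Elements whose indices already occur in the global list keep their
    values; all other elements start as null (None).
    """
    previous = dict(global_list)
    ranges = [range(low, high + 1) for low, high in bounds]
    return [(index, previous.get(index)) for index in product(*ranges)]


g1 = enter_block([], [(1, 2)])                 # first entry: all null
g1 = [(i, 5 if i == (1,) else v) for i, v in g1]   # assign 5 to element 1
g2 = enter_block(g1, [(1, 3)])                 # re-entry with wider bounds
g3 = enter_block(g2, [(2, 3)])                 # re-entry with shifted bounds
```

On re-entry with bounds [1:3] the element with index 1 keeps the value 5 and the new element with index 3 is null.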

Own variables and own arrays have generally caused prob- 
lems for those implementing languages with own variables in 
that special programs and storage areas have been needed to 
properly implement own variables. The above mechanism for 
handling own variables in the target language is quite 
straightforward and avoids the complexity generally associated 
with own variables. 

Goto Statements: 

A statement of the form "GO TO L" in ALGOL/60, where L 
is a label reference, will result in interrupting the normal 
order of evaluation and continuing by evaluating the statement 
labeled by L in the same sequence or in the first encompassing 
block containing a statement with a label L. The mechanism 
for transferring control to a target language expression in 
the same or an encompassing sequence has been discussed in 
Chapter III. 

On the other hand, a more complicated situation for 
transferring control occurs when a label is passed as an 
argument to a procedure.* For example, consider the procedure 
statement 

F(L) 

and the procedure declaration 



PROCEDURE F(X); LABEL X; 
BEGIN 



GO TO X; 

END 

Since in the target language, the procedure statement is 
translated as 

F(λπ. .L) 

where the λ-closure for "λπ. .L" is evaluated relative to the 
environment within which the procedure statement occurs, and 
the GO TO statement is translated as 

(GOTO, (X 'Λ')) 

the label-closure for X will refer to the labeled statement 
in the block in which the procedure statement occurs (or to a 
labeled statement in an encompassing block), and the environment 
given by the label-closure will refer to the environment 
of the block specified at the time when the procedure statement 
was evaluated. 

*Formal parameters that are labels called by value are excluded 
according to the ALGOL/60 report. 

Furthermore, consider the ALGOL/60 program: 

BEGIN INTEGER A,B; 
   PROCEDURE F(I,X); LABEL X; VALUE I; 
   BEGIN M: B := B+1; 
      I := I+1; 
      IF B=4 THEN GO TO L1; 
      IF B=3 THEN GO TO X; 
      IF B=2 THEN F(I,X); 
      IF B=1 THEN F(I,M) END F; 
   A := B := 0; 
   F(A,L1); 
   L1: A := A*A 
END 

Here F is a recursive procedure that is called three times. 
On the second call to F the local label M is passed as an 
argument; the label-closure for M will specify an environment 
within which the value of I is 1. On the third call to F the 
GO TO statement "GO TO X" will result in resetting the environ- 
ment within which the value of I is 1, and upon exiting from 
the procedure the value of I will be 2, and not 3. 
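
The behavior described for this program can be checked with a small simulation. The Python sketch below is only an approximation under stated assumptions: an exception class `Goto` (hypothetical) models the transfer of control, a fresh `object()` created in each activation of `F` models the label-closure for M together with its environment, and a loop re-enters the body at M when a jump to that activation's label arrives.

```python
class Goto(Exception):
    """A transfer of control to a label-closure."""
    def __init__(self, target):
        super().__init__()
        self.target = target


def run_program():
    state = {"A": 0, "B": 0}
    exits = []              # value of I in the activation that leaves via L1
    L1 = object()           # label L1 of the outer block

    def F(i, x):
        frame = {"I": i}    # VALUE I: a private copy for this activation
        M = object()        # label M, tied to this activation only
        while True:
            try:
                state["B"] += 1
                frame["I"] += 1
                if state["B"] == 4:
                    exits.append(frame["I"])
                    raise Goto(L1)
                if state["B"] == 3:
                    raise Goto(x)
                if state["B"] == 2:
                    F(frame["I"], x)
                if state["B"] == 1:
                    F(frame["I"], M)
                return
            except Goto as jump:
                if jump.target is M:
                    continue        # resume at M in this activation's environment
                raise               # the label belongs to an outer activation

    try:
        F(state["A"], L1)
    except Goto as jump:
        assert jump.target is L1
    state["A"] = state["A"] * state["A"]     # L1: A := A*A
    return exits, state


exits, final_state = run_program()
```

The jump taken on the third call unwinds to the first activation, where I is 1; after one more increment the procedure is left with I equal to 2.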
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Recursive Definitions: 

ALGOL/60 allows the declaration of variables, arrays, 
switches, and procedures that can depend on each other. For 
example, the following declaration sequence can occur within 
a block 

REAL PROCEDURE H1(X1); IF X1=0 THEN 1 
                       ELSE X1*H2(X1-1); 
REAL PROCEDURE H2(X2); IF X2=0 THEN 1 
                       ELSE X2*H1(X2-1) 

These declarations constitute a simultaneous recursive definition 
of the factorial function (e.g., the value of the function 
designator "H1(4)" is "24"). 

If E1, E2, and S are statements, and H1 and H2 are procedure 
identifiers that are (possibly) defined simultaneously 
recursively, the ALGOL/60 block 

BEGIN 
   REAL PROCEDURE H1(X1); E1; 
   REAL PROCEDURE H2(X2); E2; 
   S 
END 

can be correctly defined by the target language translation 

(1) (λπ.(λH1.(λH2.s (HD π)) (TL π)) (Y² λH1.λH2.[λX2.e2 + λX1.e1])) 

where e1, e2, and s are the target language expressions for 
the ALGOL/60 statements E1, E2, and S, and the fixed point 
operator Y² is 

λF. LET π1,π2 = 'Λ','Λ' 
    IN LET Z = ((F π1) π2) 
       IN (π1 ASSIGN. HD Z); 
          (π2 ASSIGN. TL Z); 
          Z 

Extending the alternate notation for recursive definitions 
given earlier, an expression of type (1) will be alternately 
written 

LET REC H1,H2 = λX1.e1,λX2.e2 
IN s 

and further rewritten 

LET REC H1(X1),H2(X2) = e1,e2 
IN s 
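
The idea behind a fixed point operator built from assignment can be sketched in Python: two initially empty cells are handed to the defining function, the resulting pair of definitions is computed, and the cells are then updated to refer to those definitions. The names `Cell` and `y2` are assumptions, and the pair is ordered (H1, H2) here for readability, which need not match the ordering used in expression (1).

```python
class Cell:
    """An assignable cell that forwards calls to its current value."""
    def __init__(self):
        self.value = None

    def __call__(self, *args):
        return self.value(*args)


def y2(f):
    """Tie the knot for two mutually dependent definitions."""
    p1, p2 = Cell(), Cell()     # start out null
    z = f(p1)(p2)               # definitions that refer to the cells
    p1.value, p2.value = z      # assign the definitions to the cells
    return z


definitions = y2(
    lambda h1: lambda h2: (
        lambda x1: 1 if x1 == 0 else x1 * h2(x1 - 1),   # body of H1
        lambda x2: 1 if x2 == 0 else x2 * h1(x2 - 1),   # body of H2
    )
)
H1, H2 = definitions
```

With the simultaneous recursive factorial definitions used as the example, `H1(4)` evaluates to 24.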

More generally, if H1, H2, ..., Hk are declared variables, 
arrays, switches, or procedure identifiers whose target 
language translations are the expressions t1, t2, ..., tk, and s 
is the target language translation of a statement, an 
expression of the form 

(2) (λπ.(λH1.(λH2...(λHk.s (1st π)) (2nd π)) ... (kth π)) 
    (Y* λH1.λH2...λHk.[tk + ... + t2 + t1])) 

where 

1st π = (HD π) 
2nd π = (HD (TL π)) 
... 
kth π = (HD (TL (TL ... π)...)) 

and 

Y* = λF. LET π1,π2,...,πk = 'Λ','Λ',...,'Λ' 
         IN LET Z = (...((F π1) π2) ... πk) 
            IN (π1 ASSIGN. (HD Z)); 
               (π2 ASSIGN. (HD (TL Z))); 
               ... 
               (πk ASSIGN. (HD (TL (TL ... Z)...))); 
               Z 

and where, if Hi, 1≤i≤k, is a procedure definition of j variables 
X1, X2, ..., Xj, then the expression ti is given as 
λX1.λX2...λXj.ei, where ei is the target language translation 
of the procedure body, 

will correctly define the (possibly simultaneous recursive) 
definitions in s. 

Further extending the alternate notation for k simul- 
taneous recursive definitions, an expression in the target 
language of form (2) will alternately be written 



LET REC Hl,H2,...,Hk=tl,t2,...,tk 
IN s 



Furthermore, if Hi, 1≤i≤k, is a procedure definition of j 
variables X1, X2, ..., Xj, then Hi and ti will be given as 
Hi(X1,X2,...,Xj) and ei, where ei is the target language 
translation of the procedure body. 

For Statements: 

Consider the following ALGOL/60 for statement:

(1) FOR X:=1, 2 STEP 2 UNTIL 7 DO X:=X+1

Here, since the control variable is itself updated in the
statement "X:=X+1", the statement "X:=X+1" is evaluated only
three times, for the values of the control variable "X" equal
to "1", "2" and "5". The critical point in this evaluation
is that the increment for the control variable "X" is delayed
until the statement following the "DO" is executed, possibly
changing the current value of the control variable. Similarly,
the evaluation of a for statement of the form

(2) FOR X:=Q, U STEP V UNTIL W DO s;

where "s" is some statement, can result in changing the
values of "X", "U", "V", or "W" before each iteration of the
statement. The delay in the evaluation of for list elements
is handled through the use of dummy variable functions. For
example, consider the following function definitions:

REC STEP(A,B,C) = LET A',B',C' = (A 'Λ'),(B 'Λ'),(C 'Λ')
                  IN (B'>0)∧(A'>C') => 'Λ'
                     (B'<0)∧(A'<C') => 'Λ'
                     ELSE => [A' + λπ.(STEP(λπ.(+(A',B')), B, C))]

REC DELAY_CAT L = LET H,T = HD L, TL L
                  IN LET H' = (H 'Λ')
                     IN (T = 'Λ') => H'
                        (H' = 'Λ') => (DELAY_CAT T)
                        ELSE => [H' + T]

REC FOR(V,L,S) = LET H,T = HD L, TL L
                 IN (L = 'Λ') => 'Λ'
                    ELSE => V := H; (S 'Λ');
                            FOR(V, (DELAY_CAT T), S)



and the following target language translation of the for
statement (2)

    FOR(X, (DELAY_CAT [λπ.Q + λπ.(STEP(λπ.U, λπ.V, λπ.W))]), s')*

Here the function "DELAY_CAT", when applied to the list of
dummy variable functions in a for list, produces (a) the null
string or (b) the evaluation of the next element in the for
list followed by the dummy variable functions representing
the remaining elements in the for list. The function "FOR"
successively evaluates the statement within the for statement
for each of the successively computed elements in the for list.
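
The effect of delaying the for list elements can be checked
with a small program. The following Python sketch is an
illustration only: it uses Python generators in place of the
dummy variable functions, and the names value, step_until and
run_for are hypothetical. Each for list element is a function
that sets the control variable and then suspends until the
body has been executed once.

```python
from typing import Callable, Dict, Iterator, List

Env = Dict[str, int]
Thunk = Callable[[Env], int]  # an expression whose evaluation is delayed
Element = Callable[[Env, str], Iterator[None]]


def value(expr: Thunk) -> Element:
    """For list element consisting of a single arithmetic expression."""
    def element(env: Env, var: str) -> Iterator[None]:
        env[var] = expr(env)
        yield  # run the body once
    return element


def step_until(start: Thunk, step: Thunk, limit: Thunk) -> Element:
    """For list element of the form 'start STEP step UNTIL limit'."""
    def element(env: Env, var: str) -> Iterator[None]:
        env[var] = start(env)
        while True:
            s = step(env)
            sign = (s > 0) - (s < 0)
            if (env[var] - limit(env)) * sign > 0:
                return  # this element is exhausted
            yield  # run the body once
            # The increment happens only after the body has run,
            # so it sees any change the body made to the variable.
            env[var] = env[var] + step(env)
    return element


def run_for(env: Env, var: str, elements: List[Element],
            body: Callable[[Env], None]) -> None:
    """Evaluate the body for each successively computed element."""
    for element in elements:
        for _ in element(env, var):
            body(env)


env: Env = {"X": 0}
visited: List[int] = []


def body(e: Env) -> None:
    visited.append(e["X"])  # record the value seen by the body
    e["X"] = e["X"] + 1     # the statement X:=X+1


# FOR X:=1, 2 STEP 2 UNTIL 7 DO X:=X+1
run_for(env, "X",
        [value(lambda e: 1),
         step_until(lambda e: 2, lambda e: 2, lambda e: 7)],
        body)
```

Under these assumptions the list visited holds the values 1, 2
and 5, in agreement with the discussion of for statement (1).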

The semantic constructs in ALGOL/60 are similar to those
in many other computer languages for performing numerical
calculations, e.g., FORTRAN, MAD, AED, and portions of PL/1.
The semantic constructs in SNOBOL/1, defined in the previous
chapter, appear in part in several languages for string
manipulation, e.g., PANON/1B, TRAC and CONVERT. The
characterization of certain important linguistic features, like
structures in PL/1 and AMBIT/G and real-time operations in
PL/1, has not yet been attempted with the target language
presented in this dissertation. I suspect that the delay
feature in evaluating target language expressions will prove
useful in defining real-time operations and that modifications
to the target language will be needed to characterize
conveniently operations on structured data. Nevertheless, the
characterizations of SNOBOL/1 and ALGOL/60 have provided
significant tests of the target language in defining semantics,
and it is expected that future research will yield
modifications and extensions of the concepts presented here to
define more varied computer languages.

*s' represents the target language translation of the source
language statement s.

Since the discussion in this chapter has focused on a 
simplified exposition of certain constructs in ALGOL/60, the 
remainder of this chapter will be devoted to a detailed 
explanation of the complete formal definition of ALGOL/60, 
as given in Appendix 4.

Two Abbreviations for the Canonical Systems of ALGOL/60:*

Besides the abbreviations introduced earlier, two
abbreviations have been added to the notation for canonical
systems in writing the canonical systems for ALGOL/60. The
first of these abbreviations allows the user to abbreviate
constructions defining an alternating sequence of two other
constructions (for example, defining a "for list," which
consists of a sequence of for list elements each separated by a
comma). Examples of the variants of this abbreviation are
given in examples 7 in the table on the following page. The
formal definition of this abbreviation is given in productions
21 of Appendix 1.3.

The second of these abbreviations generally allows the
user to use a slash to abbreviate productions that are
repeated for each of the constructions defining real, integer,
and boolean quantities in ALGOL/60. An example of the use
of this abbreviation is given in example 8 in the table on
the following page. The formal definition of this
abbreviation is given in productions 22 of Appendix 1.3.

*The remaining portions of this chapter are for those who wish
to study in detail the formal definition of ALGOL/60 given in
Appendix 4.

Notes on the Canonical System Defining the Syntax of ALGOL/60:

Predicates Needed to Specify Context-Sensitive Requirements:

To specify the context-sensitive requirements on the
syntax of ALGOL/60, a number of additional predicates (S31
through S41) are used. The predicate "TYPE" (S31.1) defines
a set of three members, the strings "REAL", "INTEGER", and
"BOOLEAN". The predicate "DIMM" defines a set consisting of
strings of ones, where the number of ones in a string gives
the dimension of an array. The predicate "SPEC" defines a
set of strings, where each string specifies the use of some
formal parameter in a procedure declaration. The predicate
"SPEC LIST" defines a set where each member is a string of
parameter specifications each separated by a comma. For
example, if "P" is a declared procedure with two formal
parameters "X" and "A", and "X" is used as a real variable and
"A" is used as an integer array of dimension three, the
specification list for the occurrence of the procedure
declaration is "REAL, INTEGER ARRAY(111)".

The predicate "SPEC1:SPEC2:COMB" (S33) defines a set of
triples, where the first element is a parameter specification
designating some use of a formal parameter, the second element
is a parameter specification designating some other compatible
use of the parameter, and the third element is the parameter
specification designating their combined use. For example,
if the formal parameter "X" were used in three contexts, as
a real variable in an arithmetic expression, as a real
variable in a subscript list, and as a real variable that is
assigned a value in an assignment statement, the following
triples could be generated

<Λ:REAL:REAL> <REAL:REAL:REAL> <REAL:ASGNED:REAL ASGNED>

designating the combined use of "X" as a "REAL ASGNED"
variable. Note that if "X" is used both as a real and a boolean
variable, there is no way to combine the specifications "REAL"
and "BOOLEAN" to obtain the specification of the combined use
of "X". In the generation of legal programs, the use of this
predicate prevents the generation of illegal procedure
declarations containing such incompatible uses of formal
parameters.
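
The combination of uses can be pictured as a fold over the
successive uses of a formal parameter. The following Python
sketch is an assumption made for illustration and is not the
set of productions S33: the functions split_spec, combine and
combined_use are hypothetical, the empty string stands for the
null string, and None marks uses that cannot be combined.

```python
from functools import reduce
from typing import List, Optional, Tuple


def split_spec(spec: str) -> Tuple[str, bool]:
    """Split a specification into its base part and an 'assigned' flag."""
    words = spec.split()
    assigned = "ASGNED" in words
    base = " ".join(w for w in words if w != "ASGNED")
    return base, assigned


def combine(spec1: Optional[str], spec2: Optional[str]) -> Optional[str]:
    """Return the combined use of two specifications, or None if
    the two uses are incompatible (e.g. REAL and BOOLEAN)."""
    if spec1 is None or spec2 is None:
        return None
    base1, assigned1 = split_spec(spec1)
    base2, assigned2 = split_spec(spec2)
    if base1 and base2 and base1 != base2:
        return None
    words = [base1 or base2]
    if assigned1 or assigned2:
        words.append("ASGNED")
    return " ".join(w for w in words if w)


def combined_use(uses: List[str]) -> Optional[str]:
    """Fold all uses of one formal parameter, starting from the null string."""
    return reduce(combine, uses, "")
```

With these assumed rules, the three uses in the example above
fold to "REAL ASGNED", while a real use followed by a boolean
use yields no combined specification.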

The predicate "SPEC MATCH" (S34) defines a set of ordered
pairs, where the first element is the parameter specification
of an actual parameter, and the second element is a compatible
parameter specification of the corresponding formal parameter.
The predicate "SPEC LIST MATCH" augments this set to include
lists of parameter specifications. For example, if "P" is a
procedure as defined above and "Q" is a declared integer
array of dimension three, the function designators "P(3.1,Q)"
and "P(TRUE,Q)" would have specification lists "ARITH EXP,
INTEGER ARRAY(111)" and "BOOL EXP, INTEGER ARRAY(111)".
The specification list "REAL, INTEGER ARRAY(111)" would match
the specification list "ARITH EXP, INTEGER ARRAY(111)" but
would not match the specification list "BOOL EXP, INTEGER
ARRAY(111)". Thus the use of this predicate prevents the
use of incompatible formal and actual parameters.
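
A minimal Python sketch of this matching follows. The
compatibility table is an assumption chosen to reproduce the
example above and is not taken from productions S34; the names
COMPATIBLE, spec_match and spec_list_match are hypothetical.

```python
from typing import Dict, List, Set

# Assumed table: which formal specifications an actual
# parameter specification may be matched against.
COMPATIBLE: Dict[str, Set[str]] = {
    "ARITH EXP": {"REAL", "INTEGER"},
    "BOOL EXP": {"BOOLEAN"},
}


def spec_match(actual: str, formal: str) -> bool:
    """True if the actual specification is compatible with the formal one."""
    if actual == formal:
        return True
    return formal in COMPATIBLE.get(actual, set())


def spec_list_match(actuals: List[str], formals: List[str]) -> bool:
    """Extend spec_match to lists of equal length, element by element."""
    return len(actuals) == len(formals) and all(
        spec_match(a, f) for a, f in zip(actuals, formals)
    )


formals_of_p = ["REAL", "INTEGER ARRAY(111)"]
call_1 = ["ARITH EXP", "INTEGER ARRAY(111)"]  # P(3.1,Q)
call_2 = ["BOOL EXP", "INTEGER ARRAY(111)"]   # P(TRUE,Q)
```

Here call_1 matches the formal specification list of "P" and
call_2 does not, since no entry relates "BOOL EXP" to "REAL".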

The predicate "USES:PARS WITH SPECS" (S35) defines a
set of ordered pairs, where the first element of each pair
contains several lists of formal parameters with each list
followed by a parameter specification enclosed in parentheses*
(e.g., "X,Y,Z,(REAL) A(111),B(1111),(BOOLEAN ARRAY)"), and
the second element contains the list of formal parameters
with each formal parameter followed by its parameter
specification (e.g., "X REAL,Y REAL,A BOOLEAN ARRAY(111),B BOOLEAN
ARRAY(1111)"). The predicate "PARS:USES:SPECS" defines a
set of triples, where the first element is a list of formal
parameters (e.g., "X,Y,A,B"), the second element is a list
of the uses of the parameters (e.g., "X REAL,Y REAL,A BOOLEAN
ARRAY(111),B BOOLEAN ARRAY(1111)"), and the third element is
the parameter specification list for the parameters (e.g.,
"REAL,REAL,BOOLEAN ARRAY(111),BOOLEAN ARRAY(1111)"). This
predicate is used to generate the specification list for the
formal parameters in a procedure declaration.

*If the formal parameter is an array identifier, the
identifier may be followed by the dimension of its subscript list;
if the formal parameter is a procedure identifier, the
identifier may be followed by the specification list for
its actual parameters.

The predicate "ENTRY" (S36) defines the set of elements
that can occur as auxiliary lists in the canonic system for
ALGOL/60. An entry is either an identifier, or an array
identifier followed by the dimension of the subscript list
given with the array identifier, or a procedure identifier
followed by the specification list of the actual parameters
given with the procedure identifier. The predicates "DIFF
CHAR", "DIFF STR", "DIFF ENTRY", "IN", "NOT IN", "NOT CONT",
"DIFF ENTRY LIST", "DISJ ENTRY LIST", "L1:L2:INTERSEC" and
"L1:L2:REL COMP" are similar to those given for SNOBOL/1.
One important exception to the similarity between the ALGOL/60
predicates and the SNOBOL/1 predicates occurs in the
definition of the predicate "IN" (S38.1). An entry is considered
to be contained in a list of other entries only if the
dimension of an array identifier or the specification list
of a procedure identifier matches each of the dimensions of
other identical array identifiers or the specification lists
of other identical procedure identifiers.

Specification of the Context-Sensitive Requirements:

In general, the context-sensitive requirements on the
syntax of ALGOL/60 are specified by specifying a number of
auxiliary lists with each syntactic unit and later specifying
that each of these lists has certain properties. The lists
specify (a) the identifiers declared as real, integer, boolean,
or switch variables (S24 and S26.2), (b) the identifiers
used as real, integer, boolean, or switch variables (S8.3,
S9.1 and S12.2), (c) the identifiers declared as real, integer,
or boolean arrays (S25.9 and S25.10), (d) the identifiers
used as real, integer, or boolean arrays (S8.4 and S9.3),
(e) the identifiers declared as real, integer, boolean, or
non-valued procedures (S27.12), (f) the identifiers used as
real, integer, boolean, and non-valued procedures (S9.2, S9.9
and S9.10), (g) the labels* (S20.2 and S21.3) and label
references (S12.1), (h) the procedure identifiers and variables
that are assigned a value in an assignment statement (S18.1
and S18.2), and (i) the variables used in the arithmetic
expressions in an array declaration (S25.1).

*Leading zeros in a numeric label do not affect the value of
the label. For example, the strings "00149", "0149", and
"149" each denote the label with value "149". Thus, a label
is defined (S4) in the canonical system by a set of ordered
pairs, where the first element is a label and the second
element is its value. The auxiliary lists of labels and
label references contain the values of each label string.

The specification of the restrictions on each of these
lists is complicated. The lists of formal parameters,
parameters called by value, and labels in a procedure declaration
must contain identifiers each of which is different (predicate
"DIFF ENTRY LIST" in S27.12). The lists of formal parameters
used as real, integer, boolean and switch variables, the lists
of formal parameters used as real, integer, and boolean arrays,
the lists of formal parameters used as real, integer, boolean
and non-valued procedures, the lists of formal parameters
used to reference labels, and the lists of assigned procedure
identifiers must each be disjoint (predicate "DISJ ENTRY
LISTS" in S27.12). The lists of declared identifiers and
labels in a block must each contain different identifiers
(predicate "DIFF ENTRY LIST" in S29). The lists of
identifiers used as variables, arrays, procedures, and labels must
each be disjoint (predicate "DISJ ENTRY LISTS" in S29).

The lists of identifiers used in a procedure declaration
but not specified as formal parameters (the primed variables
in S27.12), the lists of identifiers used in a block but not
declared in the block (the double primed variables in S29),
and the lists of identifiers used in the bound pair list of
an array declaration (the variables with a subscript "m" in
S29) must be obtained and specified as used identifiers in
the procedure declaration or block. Furthermore, with each
declaration (S25.4) or use (S8.4 and S9.3) of an array
identifier, the dimension m of the associated bound pair list or
subscript list is kept with the identifier in the auxiliary
lists of declared and used arrays. Similarly, with each
procedure declaration (S27.12) and function designator (S9.2,
S9.9 and S9.10), the specification list x of the formal or
actual parameters is kept with the identifier in the auxiliary
lists of declared and used procedures. The specification list
for a procedure declaration is obtained through the predicate
"PARS:USES:SPECS" discussed earlier. The restrictions that
the dimension of each use of an array identifier must match
its declared dimension and that the actual and formal
parameter lists must be compatible are specified through the
predicates "PARS:USES:SPECS", "L1:L2:REL COMP" and
"L1:L2:INTERSEC" as discussed earlier.

Finally, a string is defined as a syntactically legal 
program only if the lists of used but not declared variables, 
arrays, procedures, labels, label references, and assigned 
procedure identifiers are each given as null (S30.3). 
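
The role of the auxiliary lists can be sketched in a few lines
of Python. This is a simplified illustration under stated
assumptions, not the canonical system itself: an entry is taken
to be a pair of an identifier and its dimension or
specification list (empty if none), and the function names are
hypothetical stand-ins for the predicates "DIFF ENTRY LIST",
"DISJ ENTRY LISTS" and "L1:L2:REL COMP".

```python
from typing import List, Set, Tuple

# (identifier, dimension or specification list; "" if none)
Entry = Tuple[str, str]


def diff_entry_list(entries: List[Entry]) -> bool:
    """True if every identifier in the list is different."""
    return len({name for name, _ in entries}) == len(entries)


def disj_entry_lists(*lists: List[Entry]) -> bool:
    """True if no identifier occurs in more than one of the lists."""
    seen: Set[str] = set()
    for entries in lists:
        names = {name for name, _ in entries}
        if seen & names:
            return False
        seen |= names
    return True


def rel_comp(used: List[Entry], declared: List[Entry]) -> List[Entry]:
    """Entries that are used but not declared. An array entry counts
    as declared only if its dimension matches as well."""
    return [entry for entry in used if entry not in declared]


def block_is_closed(used: List[Entry], declared: List[Entry]) -> bool:
    """A block leaves nothing undeclared if the relative complement is null."""
    return diff_entry_list(declared) and rel_comp(used, declared) == []


declared = [("X", ""), ("A", "111")]
used_ok = [("X", ""), ("A", "111")]
used_bad = [("X", ""), ("A", "11")]  # array A used with two subscripts
```

In this sketch a use of "A" with the wrong dimension remains
in the list of used but not declared entries, so the final
requirement that this list be null fails.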

Notes on the Canonical System Specifying the Translation
of ALGOL/60:

Three additional predicates (T42) are used in the
specification of the translation of ALGOL/60 into the target language.
The predicates "LIST:CORR NULL LIST", "LIST:CORR UNSHARE LIST",
and "LIST:CORR INDEXED LIST" define sets of ordered pairs
where the first element of each pair is a list of identifiers
(e.g., "X,Y,Z,") and the second element of each pair is
respectively (a) the corresponding list of null strings (e.g.,
"'Λ','Λ','Λ',"),* (b) the corresponding list of expressions
applying the function "UNSHARE" to each identifier (e.g.,
"(UNSHARE (X 'Λ')),(UNSHARE (Y 'Λ')),(UNSHARE (Z 'Λ')),"),
and (c) the corresponding list of identifiers each followed
by a "#" and a positive integer (e.g., "X#1,Y#1,Z#1,").

*In the target language these lists are used in expressions
like "LET X,Y,Z, = 'Λ','Λ','Λ', IN ...". Strictly speaking,
the last comma in each list should be removed.
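
The three corresponding lists are simple string
transformations of a list of identifiers. The following Python
sketch reproduces the examples above; the function names are
hypothetical, and applying one index to every identifier in the
indexed list is an assumption based only on the example
"X#1,Y#1,Z#1,".

```python
from typing import List


def corr_null_list(identifiers: List[str]) -> str:
    """One null string per identifier, each followed by a comma."""
    return "".join("'Λ'," for _ in identifiers)


def corr_unshare_list(identifiers: List[str]) -> str:
    """UNSHARE applied to each identifier, each followed by a comma."""
    return "".join(f"(UNSHARE ({name} 'Λ'))," for name in identifiers)


def corr_indexed_list(identifiers: List[str], index: int = 1) -> str:
    """Each identifier followed by '#' and a positive integer."""
    return "".join(f"{name}#{index}," for name in identifiers)


names = ["X", "Y", "Z"]
null_list = corr_null_list(names)
unshare_list = corr_unshare_list(names)
indexed_list = corr_indexed_list(names)
```

As in the text, the trailing comma is kept in each generated
list.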
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CHAPTER VI 



DISCUSSION 



This thesis describes a formal system for defining the 
rules for writing programs in a computer language and for 
defining what these programs mean. The author strove for 
simplicity of the formal system, and then applied the formal 
system to define two complete computer languages, ALGOL/60 
and SN0B0L/1. 

Besides simplicity, such attendant qualities as
naturalness, perspicuity, and communicativeness have been
accorded due allowance. Necessarily, I have used my personal
discretion in weighing these qualities. It is inevitable
that further research will refine the optimal balance of
these qualities. Admittedly, there exist no known metrics
for measuring these qualities precisely. They are subject
to a latitude of interpretations. This fact should not be
surprising. Indeed, almost every computer language has at
least the theoretical capability of defining any computable
algorithm. Why so many computer languages? It is more
natural or more concise to define an algorithm in one
language than in another.

Canonical systems were used here to define the
syntactically legal strings in a computer language and the
translation of the legal strings into strings in some other language.
Not once was it necessary to step outside the formalism to
define the syntax or translation of a language. Although
some complexity was added to the formalism by introducing
abbreviations to the basic notation, even the abbreviations
were ultimately defined in terms of the basic formalism.

Extended Markov algorithms and the λ-calculus
were used as a basis for defining semantics. Prior to this
effort, work had been done by others in using formalisms
like recursive function theory, Markov algorithms, formal
graph theory, and the λ-calculus to characterize computational
processes. However, the marriage of extended Markov
algorithms to the λ-calculus is to my knowledge the first attempt
in which two formalisms have been intimately combined to
characterize computational processes. Almost every construction
in SNOBOL/1 and ALGOL/60 was defined solely within the combined
formalism. The introduction of new expressions to the
combined formalism to mirror the assignment and transfer of
control constructions in SNOBOL/1 and ALGOL/60 appeared
unavoidable. Nevertheless, these additions accomplished
complete definitions of the semantics of both languages.
Moreover, the entire target language was eventually defined by
an extended Markov algorithm defining a machine for evaluating
strings in the target language.

The extended Markov algorithm definition of the target
language evaluator not only reduced the definitions of
semantics to a single formalism, but also demonstrated that
a computer possessing only the characteristics needed to
evaluate an extended Markov algorithm is sufficient to
execute source language programs translated into the target
language. The conventional machine facilities existing in
most computers, like those for performing arithmetic and
logical operations and those for transferring control within
a program, are not needed to evaluate target language
programs, although they may be convenient. On the other hand,
such horribly detailed machine facilities, like those for
shifting bits or branching on the setting of a mask, appear
to be useless in evaluating target language programs. The
ability to use extended Markov algorithms as the basic
evaluating mechanism for computational processes suggests that
machine languages quite different from those conventionally
used might be more effective for defining computational
processes. However, this subject is, at least, worth another
doctoral dissertation.

One may well ask: Why was one formalism, canonical
systems, used to define the syntax and translation of a
language? Why was another pair of formalisms, extended Markov
algorithms and the λ-calculus, needed to define the semantics
of a language? And why were just extended Markov algorithms
used to define the target language evaluator? The following
are my answers. First, it appears convenient to define the
syntax and translation of a language with a generative grammar
(which canonical systems provide) that frees the language
designer from the details of specifying a scanning algorithm
for determining whether a source language string is
acceptable. Second, a computer language generally specifies some
well-defined algorithm for performing a computation, and
hence it seems somewhat natural to define the semantics of
a computer language with some simpler algorithmic formalisms
(like extended Markov algorithms and the λ-calculus).
Third, extended Markov algorithms alone were sufficient to
define the target language evaluator. Fourth, the
considerations of naturalness and perspicuity arise again. The
formalism of canonical systems seemed well-suited to define
the syntax and translation of a language, the combined
formalism of extended Markov algorithms and the λ-calculus
readily lent itself to defining what a language means,
and extended Markov algorithms provided the desired concise
definition for the target language evaluator. In short,
different formalisms model different processes with different
degrees of complexity.

I have attempted to separate the specification of the syntax
and semantics of a language into three parts: (1) the specification
of the legal strings in a language, (2) the specification of the
translation of the legal strings into the target language, and (3) the
specification of the primitive functions used in the target language.
Although each of these specifications must depend on the others for its
correctness, the specification of the primitive functions in the target
language was written for the most part after the specification of the
translation of the source language into the target language and
resulted in few changes to the definition of translation. On the other
hand, it is unfortunate that the specifications of the syntax and
translation depended heavily on each other. A change in the specification of
the syntax often required a change in the specification of the
translation, and vice versa. It would certainly be valuable to develop a
convention that would better isolate the specification of the syntax and
translation.

Although the semantics of a source language was formally
defined here by the target language, and although canonical
systems specify only the syntax of a language, a large portion
of the semantics of the source language was somewhat
imperceptibly defined in the canonical system defining only the syntax
of the language. By using descriptive predicate names like
"ARITH EXP", "COND STM", and "LABEL", a correspondence with
the English language was made to aid the reader's
understanding of what was being talked about, i.e., the semantics of
the constructions being defined. A similar use of the
English language occurs in a Backus-Naur form specification
of a computer language. The use of metalinguistic variables,
like "ARITH EXP", "DIGIT", and "PRIMARY" in productions like
"<ARITH EXP> ::= <DIGIT> | <PRIMARY>", does convey some idea
of what the specified strings mean, although strictly speaking
the productions define only certain legal strings in a
language. In this way both canonical systems and Backus-Naur
form make good use of one of the most popular meta-languages,
the English language.

There are several immediate uses of the formal system
presented here. First, when developing a language, it would
be desirable to have a formal definition specifying precisely
what strings are allowed in the language and what the strings
mean. Such a formal definition could be given to others for
their analysis and would sharpen the debate over whether the
convenience of each construction in the language would be
worth the difficulty in explaining or implementing the
construction. Second, after the designers agreed upon the
constructions in the language, the formal definition would be
valuable to those implementing the language or those
preparing the language manuals in that they would know unambiguously
what was intended by the language designer.

The formal system presented here opens several avenues
for future research. As previously mentioned, since canonical
systems can define precisely both the syntax and translation
of a language, canonical systems might be used as the basis
for automatic translation between computer languages. If an
efficient algorithm could be developed to recognize strings
specified by a canonical system and generate their translation,
a canonical system definition of a language could be
immediately used to translate legal programs in the language into
another language. Another use of the formal system might be
in the implementation of "extensible" computer languages.
By simply adding or changing the productions defining the
syntax and semantics of a language, the new productions could
be given to the algorithm for translating strings specified
by a canonical system, thereby implementing the extended
language.

The author has attempted to integrate and adapt three
known formalisms to define computer languages. These formalisms
have been blended into a formal system for defining computer
languages rigorously and somewhat concisely. The most
significant portions of the attempt here are the application of
canonical systems, the marriage of extended Markov algorithms
with the λ-calculus, and the application of extended
Markov algorithms to define an evaluator for the target
language. It is hoped that this work is a progressive step in
achieving the thesis of this dissertation, to meet the need
for formal methods for completely defining computer languages.
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2.1 


[Figure: (b) Derivation of a syntactically legal program and
its translation into assembler language, listing for each
production from App. 2.1a (1.1, 2.1, 3.1, 3.3, 3.5, 4.1, 4.4,
6.1, and 5) the conclusion reached at that step.]

Appendix 2.1 CANONICAL SYSTEM SPECIFYING THE TRANSLATION OF
THE ALGOL/60 SUBSET INTO THE TARGET LANGUAGE



3.1 
3.2 
3.3 

3.* 
3.5 

k.l 
k.2 



P1INART 

uin izp 



Dion«d> <rinn!<4..«'>i 

»*»<r> . rniiini<i. .t> » 

PRIHART<p..p'> ♦ ARITI IXP<p..p'>l 

PRXNART<p..p'>, A1IT1 !»«•..»•>» AIR! IXP<»*pt (♦(•' ,p" ) )> 1 
Ultl !!?•<«..»•>, TA1<T> -» BTIK»f«..(Y A88I0I. »')>; 

ITU LIIT<A.. , * , »,<1.. , » , »,<A,R..'* , ,'A , >1 

tin him..i'> ■» nt'innn i..i>i'>s 



S. P10MAM IW<|..|'>, DBC<1..1'» - PBOCIAXUMI 4f MD..HT A' II ■'>> 



Appendix 2.2 DEFINITION OF PRIMITIVE FUNCTIONS FOR SUBSET
8«t definition* for itrlm T»rl«»l«» | r,( c SIR | 

chap. Diamo>,<i>. ... ,<9>t 

t«TT««<A>.<»> <X>| 

IIARI <,»,«• >, ... , < > ; 

DIGIT <p> | LKTIR<p> | HA*K<p > ♦ CRAR<p> ; 

STR »TR < »>i 

STR<«>, CIAR<e> ■» STR<se>| 

Definition of primitive functions

CAT > - [ .. — "lA-"."]"] • 

■»«•••> ■ i ill: :: ;.£■ ] •" 
«»>(,....) - [«"s t - ; ] 



8UCC ■ • 



• /a. 
>/l. 


♦• 


TRUI 
PAL8E 


T1U1 
PAL8I 


- 


« 


/•0/r. 
/■1/r. 


Z'. 


• lr 

• 2r 


/•8/r. 
/■9/r. 
//r. 


* 


• »r 

/•/Or 
lr 


/O/. 

/l/9r. 

/•0/r. 

/•1/r. 

/•2/r. 


.. 




»r 

/•/9r 

• Or 

• lr 



/•»/r. 



• «r 



/•/ 



/•/ 



RXC »(X,Y) ■ 



•0.(1, '0') •* X 

ILS1 «* «0N(8VCC X, PR1D I)
Appendix 2.3 DEFINITION OF AN EVALUATOR FOR THE TARGET LANGUAGE

(•) Sat tcflaltloaa for striae Tarlatlan | r.r' ,«,•• ,»j ,/,,... ,x,,r, t ST* | 
l.T, TAIIAPLI | p,p' c PTI | i.J.k « IIDIX j k,k' I W It | t,t» I ID (L | 
k ( « IN ID I t ( c S*Q TL | «,«.' c LA1IL ST1 | 



DIOII 

Lirrn 

MASK 

CIAI 


DIOII<0>,<1>, ... ,<9>i 
URB<t>,<l>, ... ,<I>,<o>, <»»,«w>i 
MAMC<S>,<«>, ... .«]>! 

bioit<p> | uma<p> | mabk<p> * ctAi<p>t 


STR 


ciai<«> * sn ciako, <»>,<.>.«<>,<)>, 0>, <" !'>t 

an<>>i 

sn<s>. sn eui«e> •» sn ciai<»>i 


C0I8TAIT 
TUIUU 


STI<» - COMTAI*<'a'>; 
CIAl<e> » VAIIAMI<SIQ.(a)>| 


m 

IIDIX 


pn<i>» 

pn<p> ♦ pri<ip> i 

DIOIT<«> * IIDIZ<SN(«)>t 


urn sn 


LAUL sn<A> ( 

LA1IL m<l>, TAIIABII<t> ♦ LABIL STRolN^i 


ixp 


COI8TAIT<p> | TAIIA*U<P> - KP<p>) 
■XP«>.<f>, IIDH<i> » IXf<( 1 * f)>i 
TA1IABLI<T>. IXP<*>, IIBBX<1> ♦ BXP<k?T.»s 
TAP.IA1LI<T> , IZP<<>. IIDM<1> * IZP<( i T ASSIOI. «)>j 
8S«<a> ■» IXP<I>, 
IXP<«>, IIDII<i> » IXP<( t 0OT0. •)>» 


«w 


HMM>,<l>,<k> * M,t.t,l.( k l «)>t 
tZP<«>, Kt>, IIDlX<t>,<J>,<k> ♦. ■»«*{. (*t <) t.w.t)>t 
SM<». YA*IABLI<t> . M«<rh»| * 
SM«>, T<t>. IXP<a>, IIDM«1>,<J>,«M> » SIQ<( 1 (,« •) t k «.t)>t 


■ZP ID 
IIP »L 

sie. id, ssq ti 


COISTAIT<e>. TAtIABLI<T>, IIMX<1> ■» IXP ID<c>, <t>, <(•>,<».> , 

«T ASSIOI. >,<SOTO.tl 
BXP<M>, IXP ID<k> * HP Il<t>l 
VAIXABH<»> ( EXP<», SK<k *V%k t > * BIQ ID<k >, SK W<k.>» 
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(b) Substitution rule* 



lalt 
Str 



b | •Control 

'l 

(1*1 w»l) ainTironasent 

(1, ) »Stor« 

ht •I»pr«n*lon 



• I 



Evaluate 
Variable 



• t ■ I 



Evaluate 
Label Ref 



J° 



(».•> 



APPLY. 
P P' 

u 
(p.Vj> 



Evaluate 
Coaatant 



IP 
(lp.r)l 



Apply 

x-eap 



00T0. APPLY. r| , 

P. I' 



(lk-1 v>p' ) 

I 



« I, 



Apply 
Ooto 



(,ht h't* ) 



Evaluate 
Comb 



h" h APPLY 



V 




A 


4.1 


(J'-l »-P> 


Tea 

have It 







(J-i »"p) 



Ip 

J -a 



» 1 Y.a i qln i ht ti 



ASSIOI. APPLY. 
P P' 

(p.rlalp'.r'l 



APPLY . 
P P' 



(p.rMp'.r* ) 



P I 



Apply 
Aaeign 



(p.r'Wp.r' 
f 



Apply 

Conetent 



i' ', 
A 

(P.>) 



J**i r i»k 



Evaluate 

X-«Mp 



Re, trj' 

scat any 



-J 



■■'j 
ip 

(lp.ljCjl! 



j-i r !•*» 







5 


2 






lo. 


try 


nasi 


•nv 



J-i r !••» 



<lp,r APPLY r' (1 
I 



AS5I0II. APPLY. 
P P' 

(p'.r'Mp.r) 



APPLY. 

P P' 

(p'.r* )a(p.p) 



Apply 

AaslgB 



* 






1 






J 


11. 






Salt 
rroa l-eap 




" 




I 



<p' ,r* )a(p,r* ) 



(P.r) 



Apply 
Constant 



K*turE 
Value 



(lp.r APPLI r')I 



Appendix 3.1 CANONICAL SYSTEM SPECIFYING THE SYNTAX OF SNOBOL/1



1.1 
1.2 
1.3 

l.« 



3.1 
3.2 
3.3 
3.V 
3.5 

k.l 

k.2 

k.3 
• •• 
k.5 

5.1 
5.2 
5.3 

5.1 
5.5 

6.1 
6.2 
6.3 
6.1 
6.5 
6.6 
6.T 
6.8 



T. 

8. 

9. 

10. 

11.1 

11. » 

11.3 

12.1 
12.2 
12.3 

13.1 
13.2 
13.3 

Ik. 



15.1 
15.2 

16.1 
16.2 
16.3 

17.1 
IT. 2 
IT. 3 

n.h 

18.1 
13.2 

19.1 
19.2 

120.1 
20.2 
20.3 



DIGIT 

LITTER 

NARK 

BASIC SYMBOL 

STRIBO 

BANE 

STB BAN! 
TAB BANI 
BACK BIF BANI 

DIOIT STB 
IBT 
ARITB EXP 



STBIBO EXP 



ASSIGB BULE 
PAT HATCH RULE 
IBPUT BULE 
OUTPUT RULE 
RULE 

LABEL EXP 
STM 
STM SIQ 

SIOBOL 
PROOBAN 

BANE LIST 

sirr crar 

MPf STB 
DIPP BANE 

IB 

ROT IB 

ROT COBT 

DIPF BAME LIST 

L1:L2:IBTEBSEC 



DIOIT<0> ,<1> , ... ,<9>i 
LITTIB<A>,<B>, ... ,<Z>| 

MARt<<>, <.>,<-> </>i 

DIOIT<p> | LETTER<p> | NARK<p> 



BASIC 8TNB0L<p>| 



BASIC STMBOL<b> 



BTMIO<SIQ(b)>; 



DIOIT<p> | LITTEB<p> » BAME<p> 



SAM<a>.<a> 
EANE<B> 
IANE<b> 
IAMI<n> 

DIOIT<d> 

DIOIT STR<|> 

IBT<1> 

STB IAME<a> 

ARITB OPERAED<a>,<b> 

STRIBO EXP<A>; 
BTMRO<«> 
STR MR<1> 
ARITB EXP<B> 



* RANE<aa> t <B.B>l 

- STB BAMEiSTR Bl*t<a:a,> ,<*n:o,> s 

• TAB BAMtlTAB BEPS<a>B,>t 

- BACK BEP BAKE 1 BACK REFS<a:a,>| 

• DIOIT STB<SIQ(a)>| 
-• IBT<a>,<-a»j 

- ARITB OPEBABB< n l M >| 

♦ ARITB OPERABD<a>,<tB> 

- ARITB EXP<a«a>,<a.b>,<a*b> > <a/ti> 



- STBIRG EIP<«*»»>! 
♦ STBIBO EXP<a>t 

- STRIRC EIF<a>t 
STBIBO EXP«»>,<t> - STRIBO E*P<»qt>; 



BTBIBO<«> 

STR RAMI<b> 

TAR SAMt<a> 

TAB EAME<B> 

TAR IANS<a>, DIOIT STR«d> 

BACK REP RAME<B> 

PAT EEP<p>,«q> 



- PAT IXP«*l w M 

* PAT 1XP<B>| 

- PAT EX»«»»»> | 

- PAT !»<■(■)•> 

* PAT EPT«»B/d»>| 

- PAT EX*vb>; 

~ PAT IXP<pOq>i 



PAT SX»iSTR RBPSiTAR REP8:BACK REFS<Bir sr tr->, DIPT BAME LIST<r >, 

Ll:L2:IBTEBSEC<r b :r T :r )) >,«r t ir T :A> - pIttSrBiSTR BEPS:TAB REFS^pir^ir^- 1 

STR BAME<n>, STBIBO EXP<>> - ASSIGB BULI<a>>>; 

STR IAME<b>, STRIBO EXP<>>, PATTERE<p>» PAT MATCS RULE<ailp>i> | 

PATTERE«e> - IBPUT RULE«SIS .READ p>| 

STBIBO EXP<I> ♦ OUTPUT RULE<8TS .PRIST • >; 

ASSIGB RULE<r> | PAT MATCB BULE<r> | IBPUT BULI<r> I 

OUTPUT RULE<r> ♦ UBLABELED RULE<r>| 
UNLABELED BULE<r> - RULE<Or>| 

UBLABELED BULE<r>, BANE<B> - RULE t LABEL! < aQr : n, > s 

RAME<a> » LABEL EXP t LABEL REFKata.M 

STR *ANE<B> * LABEL EXP<*B> 

RULE<r>, LABEL EXP<l»,«a> » 3TN<r> ,<r/<l )> ,<r/S( t )>.<r/S( I )F(a)> ,<r/F(a)> ,<r/P(a)S(l )> t 



STN<I> 

STM SEO.<i>, STM<» 

STM SEQ<q>, STRIRG<i> 



- STM 8EQ<a>| 

- STM SEQ< <,}•>; 

•* STM SEQ<q* fl a>,« a a*q>; 



STM SEO.: LABELS i LABEL BEPSc«:(:( r >, RAME<», DIPT BAME LI8T«E»D,1>, 
LllL2:IITERSEC<tBD,t:a,l r :MD,l> * SBOPOL »POORAM<a>HD B> ; 



BANE LIST<A>; 

BAME LI8T<t>, BANE<B> 



DIPP CBAR<AlB>,<AiC>, 
DIPP CBAB<x:jr>, 
SAME<B>,<a>, DIPF 



BANE LIST-n.lM 

CBAR STR<axt 1 >,<ara.> -• DIPP STR<a». iapa,> ) 
P 8TR<ata> * DITF'BAKE<aia>; * z 



BAME<a> - IB<a:a,>; 

IB<b:(>, EANE<a> * IE«a:a,(> ,<b: ta t > i 
BAME<a> ♦ EOT IB<a:A>| 

EOT IB<b:1>, DIPP BAME<aia> -» ROT IE<aia,l>| 



CRAR<c> 

BOT Cr>BT<c:»>, 



■• BOT COBT<etA>| 
DIPP CEAR<e:d> - BOT CORT<eiad>| 



DIPF BANE LIST<A>; 

DIPP BANE LIST<1>, BANE<a>, BOT IKniO - DIPP BAME LIST<n,l>i 

BANE LIST<1> < Ll:L2:I»TER8EC<Ail!A>; 

Ll:L2!lBTEBSEC<l 1 tt,:l>, RAME<a>, IB<Btl-> ■» LI tL2tIETEBSEC <»,l, 1 1 :al>i 

LlsL2:IBTERSEC<lJ:lJ:l>. IAMZ<8> . BOT IB*B.lg> » LI |L2| IBTEBSEC 'a.lj , .J, 1 » , 
Appendix 3.2 CANONICAL SYSTEM SPECIFYING THE TRANSLATION
OF SNOBOL/1 INTO THE TARGET LANGUAGE



3.3 

k.3 
b.l> 
k.5 



5.1 
5.2 

5.3 

5.1 
5.5 

6.1 
6.2 
6.3 
6.1. 
6.5 
6.6 
6.7 
6.8 

T. 



9. 
10. 



11.2 
11.3 

12.1 
12.2 
12.3 



ST* IANI 
ARIT1 EXP 



STRIHG. EXP 



ASSIOR RULE 
PAT MATCH RULE 

IRPUT RULE 

OUTPUT RULE 

PULE 

LABEL EXP 
ST* 



13.1 


a 


13.2 




13.3 




Ik. 


s 


22.1 


i 


22.2 


i 


22.3 





■AMI<n> - STR «AKE<n..n>,<tn.. (LOOKUP. n)>; 

IRT«i> * ARITH OPlRA»I><''i , '..*i , >s 

STP »AME<n..n , > - APIT1 0MRA1»< a . . ■ • > I 

ARITH OP1RA»D<«..» , >.<0..» , > - ARITH !»<»♦». .(♦(•"•.»• ))>. 



.STRIHO EXP<A 
BTRIBG<>> 
STB RAME«n..n' 



•A'>; 



STRISO BXP<*V\. 
STPIRO XXP<n..n' 
. STR1I0 EXP««..»* 



APITH EXP<«..>'> • STR1I0 EXP««..»'>; 

STlIin EIP<»..n'>,«t..t'> » STRIBO EXP«»Ot..((CAT ••) t*)>; 



DlSTR |>! 

: bcBAL STR l>i 

: <»,«)ePIX L« STR 



STRIlff«.> * PAT IXP«"V , ..'i , >l 

STR RAHE<n..n'> » PAT BXP«n. .n'> j 

VAR RAME<n> - PAT EXP:SP1CS«»»». . 'n 

VAR 1AME«B> - Mt HPlS»«CS«»(»>».. 

VAR «AMl<n>, PIOIT STH<«> -» PAT Mr'lSFICS<*a/« a .. 

BACK REP RANE<a> * PAT EXP«n. . '«•> l 

PAT IXF<n..a , >,<q...q. l > - PAT EXP<sO<|..((CAT »■• J l')>i 

PAT EIF<i>..l>'> -> PATTERH<p..»'>l 

STR HAHE<n..B'>, STP «XP<«..i'> - ASSI01 RULE<a-»..(a ASSI01. •■')>; 

STR IAHE<n..n'>, STP EXP<«..« , >. PATTERR: SPECS: VAR RtFS<p. .p' :c: »> 

♦ PAT MATCH RULE<BUl>-•..(MATCH_AHt>_ASSIr,R(n•, p*. »!.••, 'e', '(t) >>l 



SS010L PROORAM 



LIST:BVS:CORR 
1ULL LIST 



PATTERR: SPECS! VAR REPS<p.. p':c:t> 

- IRPUT RULE'STS .READ p. . <MATCH_AHD_ASSIOH(READER#. p 1 . »«. « . « 



STRIR". EXPO. 



(»)')>; 
.■. - OUTPUT RULE(STS .PRINT ...(PRIHTERf AWI-.l. ((CAT PRTBTEP.O ■•))> 



ASSIOR PULE<r..r"> I PAT HATCH ROLE«p..r'> I IRPUT RULE«r. . r" > 

| OUTPUT RULE<r..r*> •• UHLABELED RULE<r..r'>; 
UHLABELED RULE<r..r"> * RULE«0r. .r ■ > ; 

UILABELED RULE<r..r'>, HAME<n> - PULE'nOr.. n :P'>; 

BAMR<a> * LABEL EXP<a.. •■>; ,,.._ . . > n 

""»?««»..»■> * LABEL EIP< In.. UOOEUP. ((CAT ••) a ), 
RULE<r. .!••>. LABEL EXP<1. .!•».«■. .■'> - «"' r " r VTihiLi 
" (OOTO. t") ELSE * ■A , >,«r/«(l)P(«)..P* ^(OOTO. 

^ (OOTO. 



<!-/■(().. r* 

<r/r(»)..r' 



•A> ELSE * (OOTO. l')>,<r/r(n).(l>..r' .JfnoTO. 

^(OOTO. 
- STM 8«<i..«'»i 
STM-i.-.'^STM SEQ.ljL.l';'^; 
STRIRO<» ♦ STM SEO<o>»». .«'>,<•■*<]. .1 »■ 



><) IUI 

■ M>. 

I • ) ELS1 

a-)'-. 



STM<s. .•'» 
STM SE0<1..q 
STM SEQ< q . . 1 

STM SW.STR R«r.<„..,-:. >. ™ME<a> 1 «"^«'«»» .'^ ; LI "' 
. SROBOL M00RAM«jHR6 a. .LET r^-t IR (OOTO. n J , <■ . 



RANE<a> - IMTllMiCOlR'MLL LI8T«»i»l'» 
LIST:»VS:CORR HULL LIST<1:*:*>. EAME«B>, 

- LIST:BVS:CORR HULL LIST<1 ,»!b;«> ; 
LIST:BVS:COHR SULL IIST<« :■:»> . » A,,I '"* > 

» LIST:BVS:CORR HULL LIST- t,B:»:».' «'> 1 



Il<n:t> 
HOT IH<»!«> 
Appendix 3.3 DEFINITION OF PRIMITIVE FUNCTIONS FOR SNOBOL/1
8«t Hflnttlon. for .trig. rarlabl... | r,. STR I b.a SAL SIR | 



BTR 
■AL 8TR 

FIX LI 8TI 

■OT COIT 



DIGIT<0>,<1>, ... ,<9>;

LETTER<A>,<B>, ... ,<Z>;

«ARC<*>,<->, ... ,<?>> 

DIGIT<p> | LETTER<p> | MARK<p> → CHAR<p>;

STR«»>; 

STR<a>, CIAR<e> ■> STR<ae>| 

STR<a>, ROT CO»T<( !•>,<).•> * 1AL 8TI<i>i 

BAL 8TR<a>,<t> . BAL STt<(a)>,<at>| 

FIX LI 8TR<AtO>; 

FIX LI STR<aia>, SUCC<Bta>, CIAS<a> » FIX U 8TR«acta>; ' 

DIFF CIAI<A:B>,<A:C> <tl»>| 

CIAKe> » (of COIT<etl>! 

IOT C0*T<eia>, Birr CIAR<«la> » «0» COIT<01.4>| 

8I» OP IIII8iZIROS<9tO>| 

8TR OP IZII8l»IOS<ai7> * *Tt OF *IIUlIUK>8<a9l70> | 

***«»> * SBCC«a0lal>.<Bl!»8>, ... ,<a6ia9>| 

8TR<a>. STR OP IIH»l»M08<»iy> «■ 8PCC<aily> ,< t Oanlr>,<alataZr> . ... ,< > 8»n»j> 1 



(a) Macallaaaaaa ...1. »,i, t t<... 

CAT • • 

I«(a,B) - 
IIQ(a.l) • 
COIDd.a.l) ' 

AIB(a.B) - 



[ - 

[ :;.: 

f a/B. 
|_ a/B. 



TL a 



hot 

FALSI 

TRUI/TROI 
TRUt/FALSI 
FALSI/TMI 
PALM/FAUI 

(b.c) 

b 

(b.c) 
() 



(b) Arithmetic primitives

*" • [ ".: : 



IIOATI a 



18 P08 a - 



18 in a - 



[-:: 
[-:: 
[-:: 



/•0/r. 
/al/r. 

/a8/r. 

/.9/r. 

lit. 

I ill. 
/l/9r, 
/•O/r. 
/al/r. 
Imilr. 

/•9/r. 



FALSI 
TROI 

TROI 
FALSI 

• lr 

• 2r 

a9r 

/a/Or 
lr 


9r 

/•/9r 

a Or 

air 

• Sr 



"[A— " 

nut 

FALSI 

FALSI 

TROI 



TROI 
FALSI 
FALSI 
FALSI 

b 
A 



'• () 



/a/ 



/a/ 



.j.J 

] 
] 



a/B 
a/I 



•/* 
ric i(x,r) ■

PIC SW(X,Y) - 
SIOI(X.Y) ■ 

LISSU.I) • 

bipp(x.y) • 

mc frod(x.y) • 
iic quot(x.y) ■ 
♦(x.r) • 

-(x.i) - 
•<x,y) • 
/tx.D - 



IQ(Y.'O') * X 

EMI * i(M» *. ">» D "> 

M(Y.'O') * X 

ILSI *» 8UII(SUCC X, MM X) 



AID (18 FOB X, I8_P08 I) 
AIB(I8~P08 X, I8_IIO T) 
A»D(I8~II0 X, I8_P08 T> 
IL8I 

IIO(i(Y,X), '0') 

UM(I,D * IIOATKi(I.X)) 
M(X.Y> * '?* , 

IL8I * i(X,I) 



•A' 
••• 

•A' 



iqd.'OM 

11.81 



SVM(X. 'DOB (X, FI» X)) 

li88(x.d $ ;s; {tl . t q00I (i(X , r) , ,)) 

AID(I8_P08 X. IS_POS TJ ^ "°"i?; T, „. .»• 

A»(i8.«o x. i._po. «) ~ «sSSiiJIStm I. *» t» 

♦(I, II0AT1 T) 

LIT 8-8101(1,1) II CAI(8, FIOB (AM X, AM I)> 

LIT 8-SIOKX.Y) W CAT(8, WOT (AM X. AM I)) 



(e) » B «lc »»tf r|f HMF 1 " *—« tloa 



MC ASM LIST(l.M) - Lit I.T -"»;•«.} 

" Si:' * LIT .1-(L00* W . I) I- <;i a f_»«- ( , ( ,"%L , ! I ! ,, 

HATCI_AIB_A88I0I(IAM. PAT, ST»_M?.SIT_BPEC»,TAM) 



LIT » • ( 



[I IH_8PIC8 1 

i PAT t. ». (TAM.(.).(t).) IAMI) 

,. A J 



" ,L8." ,A,> 5 Lirir^.-J • O «. * (« .). » (» <«• Ol 
II A80I_LIW(TAW,«1)« 

ST !afA.«0.!"T(S»((CA, .«) .«.«*»» .1). 
•TBOI' 



(d) D«flBlttO» or LOOIOP. * «■ >« »«•«« to wrtwWr 



LOOKUP. APPLY. 
P 

(P.«) 



T.l 






1.1 
l.t 

1.9 
l.t 

(.1 
i.l 
t.l 
».» 
».5 
>.< 
2.T 

3.1 
3.! 

k.l 
».» 

k.J 

k.k 

5.1 
».» 
3.3 

(.1 
C.l 
«.3 
(.» 

7.1 
T.l 

1.1 
t.l 
1.3 

a.t 

9.1 
9.8 
9.3 
».» 
».J 
».• 
9.T 
9.1 

9.11 
9.11 

10.1 
10.2 
I0.» 
10.» 
10. J 
IO.« 
10.1 

11.1 

11.9 
11.1 
11. • 
11.9 
11. « 
11. T 
11. • 
11.9 
11. 1C 
11.11 



Appendix 4.1 CANONICAL SYSTEM SPECIFYING THE SYNTAX OF ALGOL/60



nan 

unn 



110 tTI 

lit in 
id m 



» 101t« 0».«l> <<>, 

«ttn«A»,«l> «!»,«*••», €•»-», 

Dia:r<»> | un»i> I kam<>> 



♦'■->i 

CUUK»», 



en 



Ml BItIM 



UllliHl 



ud or 
■ok or 
■n or 

mm in 

oniai mi 



ID 
IDLIlt 



rci Dif 



?21i!l 4> * ••! in<iw(«)», 
S2'" * "* «■«•■*(■ >m 

iftt««i> ♦ i» m<i>i 

««i», umt<i> * ip in«u»i 

ip«i». DI«It<C ♦ IB «n«u. 

cui<» - in<inii)>, 

"I MIH<,»| 

lit m<l> •. Ml DILI*.). -,•(>, 

ID •?«<•> * LAIlliTAKit.., 

LA1H.|TAI<1|1>,<2|2> <9t9>> 

"»«.l»Al<li»>, DI0I»<«> V 

UlIliVAl«ii*>, BJ8IT ST»>1» - 



lAlllt?Al<t«iT4>| 
LAIU,iTAI<01it>| 



11.1 
11. t 
11.1 
11. » 
11.9 

19. 



It. 
19. 
1«. 

IT. 



•ooi in 



DM Mr 



■IT 



riOC 1TM 



AID 0r«»» t <->| 

mm or«i>, «*/*>,«»., 
in or<*«*>,«^>,<»» l <>> #< * 



«*>i 



inn m«t> 

HUT •»<■>, <t> 

■Mm iit«i> 
nun im«a> 



• VI1I0I !!(<•> | 

*• UMI0I HM««»,<.t> t <a.t>i 

♦ IIT<1> ,<♦!>, <-i., 

* 10M<B> ,«♦■>,<->> | 



id m<i> 



»<i>i 

I»LI«T<»LT»««(t 



,)>l 



»«» £:::. , 0MC „„ „„,„«„„., : ;zs»£ ssise:;:;:,., 



«iAi/iit/io»t vaih/i/i tam«i,i.., 

•lAL/Ilt/MOl Ml.l/I/I AMA»<l|t J, !(.),., 



™«.', ..«. ...- * * CT "I'liiein *AM<iii«tci.ii.>> 

22 IE"*' ♦ *er rA«nrtcf««itAiii,.| 

«r r««,. . mi uuiK* . act pa> »Ai*itn W (, «>., 

!!£!?"•'»•• """ °"» > * tnjKAtnntf ■)»« 
nwr.t». am or... . m. ««,AwJn(t'.) M 

•ia>u Aim iir< s > . Aim txr<i>i 

•ooi nr<t>. inru Aim iir«.». Aim n»<» ♦ Aim nr<ir t nn . mi .», 

looi rniHnoi>,<rAiii>i 

■nru urn nr<*>,€t> m or<r> . iiLAttoaort* . 

»ooi iir<k> ■• mi niK >i>, ' ' 



■ooi nnM»)>, 

• moi iie<f>,< p>, 

> mml rAe<unn(> a)>; 

• dooi nnKAini«(r v)>i 

■ 1001 IH><ALni«(t »)>i 

• uaru MOKAinttd •)>■ 

■ M0L 1IT'«», 

ItVU IO0L<» - 1001 ID'If . Till b IUI ,>, 



•ooi iir<k 
■ooi miHi 
•ooi 1IC<» 
•ooi Me<r> 

•001 THJI«t> 
•001 IM><1> 

uaru iooi< a > 
moi ■»<•>, <k> 

JjJS ,M i:;^*.„ * """" — "»>"iii nrio...., 
IH .i. i"™ """ * nm " "• , "» , » »«i«iuiii.M 
!f*-fP^ .„ * *""■* ■" »»«u>»i 

uaru m iir<» . m ur<», 

■ooi •»«»>. iiHrii m ■!!<■>, in nr«> - on i»<ir t nn . tuu «., 

Aiin »<•> | MML ID... | in | W «., . Ixr««>| 

"■«•>. in eoR.ii>> . nam in<com>iT •>, 

BU •»•«» .. »»» itb.00 to «,, 

rci in<r> , noc ,„,.„ 






11.1 

U.2 
It. J 
18. A 

ia.5 
10. « 

19.1 

19.2 

19.3 
19. k 
19.5 



20.1 

20.? 

21.1 
21.2 

21.) 

22.1 
22.2 
22.3 



2*. I 
24.2 



25.1 



mot iin 



ro> m 



U1C01D Bin 



C0>D STK 



♦ i/i/i ttft put.m«» n»e im<iu. 

3S KSEE: fesffSR.-^ : ES US 85.:=:. 

I/I/l A.0T «■■>•> » *»« *»<•>( 



COMPOUID STK 

Tin sic 

UUI DIC 



■ ro* un il««m 

• F01 LIST IL<» .TIP » U1TIL «» | 

> poi li.t il<» mili »>i 
ro» uir<utn«(« .I- 



in wf'i.i'i'; 



UITI IXf<«> 

inn »«•>. «»».«•» 

UICOID »!»<•>. LA1IL.TAL««.»' - UICOID »« t MB* ' ' •«»• ' 

iooi E»<b>. meow iw»> ♦ ™»j> "£" J ™JJ J'i,,,, „, 

loot. up<k>. weoio •«<«>, itik.> - com »««" *,™S. *.*.:, ' 



0CCOID »»<■» I C0ID ST««.> 

5TN<» 

•TN<>>. 1T» !£«•>«> 



25.2 
25.) 
25. > 

25.5 
25. « 
25. T 
25. • 
25.9 
25.10 

2*. I 
26.2 

2T.1 
2T.2 
2T.3 
JT.t 
2T.5 
2T.« 
2T.T 
2T.« 



2T.B 

2T.9 

2T.1M 

2T.111 

2T.12 



ronus. PA1 

MIT 



TALUI PUT 

•ptcifiii 

MIT 



hoc bic 



2B.1 
28.2 
21.3 



DIC 
DIC St« 



• ITU 8E9<«>1 
■ Itll ICQMl.'i 



S» 1I0.<.». 5TI..». .01 C0.T< i! .'.<I»»"'Vll" > »> * C0W0U1D STN-KOI. . HD .>, 
TOLI.T.I. - TIPI DICs DIC I/I/l TA1KMAL/IITI0M/1O0LIAI «■»,»• 

\lu"<t> - "pi SclSie i/i/i tam.om iku/imm/MMUi i.i.m 

«.t. ».* m».i »«..» «•»• ««•■» »"t':;,""!^,""";".Jm?'' "*"'* " 0< "" " ocs 

Bin I AMAT8.DII 1 HOCI.BIII I PIOCS.B" • »«0CttBI» I J" " 

.•■.•» 1 . r f;iv!'Vi ,, t"! l S*J"i , i'*» , »"''''" V>A 

10UID PAI1«P> - IMIlTt»IIHI«».lM, 

UUt<|[|]> * »«"* IW1UIM 

M«At<l I >. UUI »IK.M> * *««*' «o«i.«U)>» 

UUI •!•<.>, UUI LIIT<»> •'■J' JlI'IJ}!!;} UUIKIUl/Iinm/MOlU"' UUI ilT»| 

Ml IIT«> * *• LUT«*I.TIM<< ,)'l 

IMTK1>. I* LIIT<t> ♦ •» DIC 1 DIC I TAM<IVITCI UMll.M 



IMT1<1> 

POUUL PM<P>. PU BSLIIH* 

POUUL PM PAIT<A>| 

POMAL PU LIIT««> -> POIIIU. PU PUT<(i)>t 
TAL0I HPM> ( 

IDII1T<I> - TUUI PUTlPUI«»/U.0I >)■>>) 
TTPI<l>*I.>finHn> ,<MUU> 1 



POUMl PUlMII«lil.»t, . 

POUUL PU IHKALTind «)>> 



IDII1T<I> > TIUH nniruKnua >)■•-> 

[iDUiT<i>. ineiriii<«» ♦ iPieiPin iiit.pui«hi = »{■).»! 
"■ciru! lim***'* tPteiPin mwnl'l'i 

IMTl'i>. POMU PU PUTlPUI«f.t f ». «l»l PUTiPUl-...,'. 
IPICIPIII HMtfUK.it >, ITB.I »U»lI TUIll M»ll »»»»'» 



*r»v»AB» ■ hi ■■ w» - - -_ ■ 

UMIHI UUISlB 411*11. « PIOCIlI PI0CI: I PIOCIlI PIOCIlLUIH 

Ll.L2.IITIIIie.IIL «M»«V'p"»f"J ,,< *"" , --"' ! * - "' "-"■"" i *'"-"" ! 



p-rfr ■ i"p'*tr"r---»-p-»f» ' ■ -^ !T •f i *J'• 

«• r ■^■• rr •»;».«•l>V•if*•i , •'•» ,^ »'•»' , *» , • 

<p I .'','»,f'»; , '''»i"p'»«f "i' •«»i ,f i'»»f'»i*'*». "p !P .r*V* 
, *.".' 1 rf , ';* ><, i !, ."" , J >, ""'V' , «f' , ii' , "p«"'"" , P« > 

MFP UflT LXST<f »■<«_*»***'» 

»*" "» ' rf ' " »' (Ij,™,. (MOIIU). ,(««Tell. ,(»I»l UI»t). lf (IITI0H »UAI) 

' p ' f t»oLiu UMT>, r .(ii«. noe«»«), lf .(i«ioi» p»ei««). bf (i00Liu piooiduu) 
/'(ioiml pioeiwn) rf (uiii)« («unu 4 , f (»»aii»» .«> 

..», ..r ».<• l/l/l/IIMCIll TAM.I PAU.l »«M.I UH.I AMATI.I UIAII.l AP.P.AM.P PP0CS 

* ? " ?!5e!!» Ji.Sc! » mSS laIiSIluil rips.aioiid mpj.aioied pioc id. 

illAl^TWWl/lOOLIU/A HOeiDUH If,... KiJip^tJiw^^I^I.J'S 1 

p;.p;.p;.p;.»'»;t»^>»;..M 

TIPI DIC<«> I AUAI »«««• I •» BIC*. I PI0C BIC<d> - BIC««»I 
DIC<d> * B'C Sl««*'i 

DIC<<>, DIC SI«<.> • BIC m<«)<>l 

UTM SIQ.l lUt.I TAM.I VAM.« »AIS.« A11AIS.I AKIAT6.I A1MIS.P PP0CS 
| .1 P10CS.1 PIOCIlI nOCSlLUElllUUL «IPS<..» r l» 1 lT ¥ .r,l» 1 ... l !« k iP,!P l >F»"P,' i " r • 
IDIC ISQ.l TAM.I TAM.I TAM.I TAM.I AUAII.I AMAH . 1 UIATS . » PJOC1 
.1 PBOCSll P100I.I PMd.DIC I TAM.DIC I TAM.BIC 1 TAKlUC 5 TAM 
Die I UlATIlMC I AMATl.Die I AlMTl.DIC 1 PMCS.MC I MOCIrillC I PJO« 

Sc . pp^i'iI: % tImtSi . »».»» » «m^™ «««•««, » *:» t !,s" ' *"*" 

i .DIN 1 AMAH: Oil 1 PIOM.DIH I P10CI.BIII 1 PlOCI.Dm 1 PMCB.UIIl. MPi 

. <«.»;.»• i» k n;. •;.•{. •;.p;.p;ip;'p;'T r i'»n'%«'*.i I SA'*i«™M"'« p i*»« ■« 

i "r."l.' T b.'*..' , ™' , l»'*»""™" > '»"»« :P »"'*' 

fST*<e> . 10T C0IT«t'.>. < IIB.c > . < ILII.e>. ___^_____ 






30.1 
30.2 
30.3 



31.1 
31.2 
31.3 

32 

32.2 

32.3 

32.1. 

32.5 

33.1 
33.2 



3k. 1 
3k. 2 
3k. 3 

3k. k 
3k. 5 

35.1 
35.2 

35.3 

35. k 



35.6 
35.7 
35.8 

36.1 
36.2 
36.3 

3T.1 
37.2 
'37.3 

36.1 
38.2 
38.3 
38. k 
38.5 
38.6 

39.1 

39.2 
39.3 

ko.l 
ko.2 
kC3 
kO.k 
kO.5 
k0.6 
kO.7 
ko.8 



kl.l 
kl.2 
kl.3 
kl.k 
kl.5 
kl.6 



SPEC1:SPEC2 
,COHB 



SPEC HATCH 



SPEC IIST 
HATCH 



USES: PADS 
WITH SPECS 



PAPS; USES 
: SPECS 



ENTRY 
EHTRY LIST 



DIPF CHAD 
DIPF STB 
DIFP EHTRY 



Ll:L2:REl COMF"T r r':T 4 :t"» 

<p r»; , »* , *; > 



p i p I ! 



'ld'V 

•id ! *i' 

'id"!' 



•b't'Sd 1 *;'. 



<i, *»b ! »id !p ; > ' , ' i »'': ! »„d'i': > . 

»irr ehtry «"<v uV ..Wi,WH' 1 A t >rt" 1 

««»»* "■»«(» r )W t H % )(,,)(. r ,(. lH% ,f,;t l , „„„„„.„. 



Vd»' 1 .)(' id )(«..)(-. d )(.„)(. ra )(. I .)(. ld )(, 11 .)( Sa ) 



(p r. ,t »,* ) <»l.>'»u)Cp b )(p )(,)(, ), 

" ,L0CK :J KEaVJEr! 1*11* **" S: " »»"*" = ! ARRAYS:" ARRAYS,. PROCS 
:I PROCStB PR0CS:« PROCS: LABELS :LABEt REPS 



"BEGIH d;i EHD citHyI. 
P J p r» ,p I p l. ,p b»b 



T i'l.= 

.=/\=t; 



b ba 



■n : Vli l, »S, 



Ji^^S',.. 00 ''" 1 '* 1 ' STK< ' >> - PROGRAM STR<p. : 
PROGRAM STR", >, LABEL :VAL«>:». . PROGRAM STR"!' '-.. : 

"Two'crS MO?!' 'i\lfi a " ,A " S:S VA " S ' R ■*«»««■ I 'ARRAYS:B ARRAYS : R PROCS 
.1 PR0CS:B PROCS:B PROCS : LABELS: LABEL REFSiASCHED PROC IBS 
< » : . . : .. : , . : A : A : A : :. : A : A : A : A : A : A : A > 
' ALGOL PROGRAM".., 



TYPE" REAL) "INTEGER., 

DIMH<1>; 

DIMM".. - DIHH<»1. 



'BOOLEAH. 



SPEC< 
TYPE" 
TYPE" 
TYPE" 
SPEC" 

SPEC' 
TYPE' 



<LABEL»<S»ITCH..<ARITH EXP. ."BOOL EXP> ,"ASGSEO» <YALUE> 
..„, * "" <t> -' TALUf: f.«*S01» t..<ASOBED VALUE V 

~ SPEC<ARRAY»,.t ARRAY.. <t ARRAY!. )..<»ALUE t ARRAY! 



- P "pEC S LisT<lLTSE^"°)fi""' E>, ' t P "° CtI "" ,E * '•** "OCEDURE! . j..",OHVAL PROCEDURE!.) 



PROCEDURE! «):IOIVAL PROCEDURE!.). 



•" ~ SPEC1:SPBC2:C0MB<A:. :..,<.:..... 

i.RA?*?'I!!.A;, 8 w C1 i!IfS? , ? 0,O * Ai,, * ,, "" l »»»«(«> = e"L ARRAY(«)> 
„. A ™ ' ARRAY(«):t ARRAY(.)>i"t,VAI,UE:VALUE t» ,"t ,ASGHED:ASCBED t 
TYP?*«° siJ? ?^/"" 1 J'*" *»»"!-) >T*Liai»AlUI I ARRAY!.);, 
5 ili/if.*." LIST< " * SPECl:SPEC2:C0HB«PROCEDURE:I0B»AL PROCE 
<t PROCEDURE:! PROCEDURE!. ) :t PROCEDURE!.) 

EXP SPEC"A>, "VALUE., "ASOHED VALUE.* 

SPECl:SPEC2,C0HB"»:t:c. - SPEC HATCH<.:«>[ 

"".OoTeXP:. BOOLEAB>- * SKC ■ ttT «*» 1 " «»■• »»M<A.IT. EXP:. I.TECER. 

SPEC HATCH". :t. ' . SPEC , IST „.,..«. ,, 

SPEC HATCH.. :t>. SPEC LIST HATCH..':!-. -. i»c uSt Em!::.^*. .,,, 

IDLIST"t> * USES: PARS HITH SPECS"A:1,>; 

IMT *!i:' "' K »'»»K»'eOIO«.ttie>, USES, PARS WITH SPECS"* 

- USISiPARS VITH SPECS<«l!t):»l e.y>; 
IDSTR<1., SPBCl:SPEC2,C0HB".:t:c>. USES:PARS WITH SPECS",.! 

- USES, PARS KITH SPECS<».i(t ) :,1 . ,. , SPECS".! 
E "™l'.ll>l'.l S "«=S"«:CO«B«.,t(p),c. U8ES:PARS HITH SPECS". 
».;.; S f?' PA " B """ SPECS<ul(p)(t)„ lc, 7 ., 

! ™ii P i > .:. a !;? :SPEC2,Ce " n<,!t(p,!l!> . USIS.PABS WITH SPECS.ul 
— « " !PAIIS *"" »*W«»,J(p)(t)iMt c.y., 
PARS:USES:SPECS<A:AsA»; 

?S!. ,A,S """ *"<">"»*' - PARS:USES:SPECS«A:u:,.j 

IDSTR.i.. PARS,USES:SPBCS<p,»„l,,. , PARS:USES,SFEC»",>1 ,:«:iy. 

ID<1>, SPEC LIST<«., DlHM".. » EETRT"i> ."1 (•)>. <!<»)>; 
EHTRY LIST«A.j ' * 

EHTRY LIST"!., EBTRY".. - ERTRY LIST<«,«.; 

PIPE CHAR<A:>> ,<A:C> <[:)>; 

CHAR STR OR HULL".!.. ,<-,yt> , DIPP CHAR"x:y 
ID STR<l.,<j., DIPP STR«l:j., SPEC LIST".. 



t),«L.y> 
Xl.tP. 

t),m.,r> 



~ DIPP STR<ax. 
<t. - DIPP EHTRT<1 



,«rt.. 

:J', •!(•): 



ID STR<1., SPEC LIST MATCH'. :t>, DIMM<a> » EHTRY HATCB<l!l 

EHTRY MATCH<e:t'' * IH«e*«' >• 

IH<.:«>, ERTRY HATCH..:.-. - IB<. I. • '.li ,<.,l.-.> ■ 

IH«.:1., DIPP EHTRYO:.'. - IH<.:.' ,1. ,<•:!.> >■ 

EHTRY'.. . , 0T IR<.. A> . 

HOT IH<i:l., DIPP EHTRY<.:.'. - HOT IH<.;.' 1. • 



:J>,«l:J(t 
..«l(.):i!t).,<l(.):t( 



)>.<l(.):J(t). 
• )>; 



HOT COST F CHAR STR OR HULL<.» 

HOT C0HT-.»it., DIPP CHAR<I,y. 
HOT C0HT<.««:ty., DIPP CHAR<i,y> 



DIPF EHTRY 

LIST 
DISJ EHTRY 

LISTS 



L1:L2 

IHTERSEC 



L1:L2 
:REL COMP 



HOT COHT<.: >; 
HOT COHT<*z:ty>; 
HOT COBT<.x.:ty.. 



EHTBY<e> ( HOT IB<«: 



EHTRY LIST« 



DIPF EBTRY LISKA 
DIPP EHTRY LIST<< 
EHTRY LIST<1. 
LIST OP LISTS, UHI0H<1: 
EHTRY LI5T<1. 
DISJ PAIR OP LISTS<1:1 
EHTRY LIST<£. 

DISJ EBTRY LISTS<1>, LIST OP LISTS:UHIOH 
- DISJ EHTRY LISTS<t(l')>i 



- DIPF EHTRT LIST<..l>i 
• LIST OP LISTS:UHIOB<(i!,l>l 

- LIST Or LISTS:UHI0K(O,(i' ):u 
r DISJ PAIR OP LISTS<i:A>; 

, EBTRY".., HOT IH<«:1» - DISJ PAIR OF LISTS<i,.,i 
♦ DISJ EHTRY LISTS" U)>; 

DISJ PAIR OF LISTS<v:i*> 



EBTRY LIST'l. - LI ,L2: IHTERSEC: "I , A :A., L1:L2:REL C0MP<1-A-A>- 
L1,L2,HTERSEC<1,1 , ,.1. 1 EHTRY".., IH<.,1. - LI ,L2:IHTERSEC' 

L1:L2:IITERSEC<1:1 , :1., EBTRY".., HOT IH".:<. - LI :L2:IHTERSEC 
L1:L2:BEL COMP"! ,1 ■ :r. , EBTRY".., IB".:1. 
L1:L2:REL C0HP"t :i ' :r> , EHTRY".., SOT IH«.,t 



L1:L2:IHTERSEC<1:1':1>, L1,L2:REL COMKlsl- 



..«':., 1- 
«.i':l'; 

- L1:L2:REL COHP" I :.,! ' :r- ; 

- L1:12:REL COMP"!:.,!' 



- L1:L2:IHTERSEC:REL COHP" 
Appendix 4.2 CANONICAL SYSTEM SPECIFYING THE TRANSLATION
OF ALGOL/60 INTO THE TARGET LANGUAGE



uasisa BUM 



6.3 

6.k 

7.1 
T.J 

e.i 
a. 2 
a. 3 

e.k 



m 

■UN 



IBLIST 
VA« 



ARITH ED 



10.2 
.3 

10. k 
.5 

10.6 
10. T 



11.1 
11.2 
.3 
11. k 
11.5 
11.6 
11. T 
11.9 
11.9 
11.10 
11.11 

12.1 
12.2 
12.3 
12. k 
12.5 



13. 

Ik. 

15. 

16. 

IT. 

18.1 
1B.2 

ia.3 

lB.k 

18.5 
16.6 

19.1 
19.2 
19.3 
19. k 
19.5 



EXP 

DUMKY STH 
COMMIT STH 
00T0 STK 
PBOC STH 
A5CT STH 



BIOIT ST»"«»,<t> » UBSIOB IUN"...'« , >,".t..(TRAIS FBAC 't , )>. 

<>.t..(«(TBAIS I»T ••', THAIS FRAC "t'))>; 
UHSIGI 1«<1> ~- I«T«l..'i , >7<»i.. , .'»,<-l.. , -l , >, 
UISIGI aUN<n..a , > ' IUM<n. .«•> ,<♦«. .«•• ,"-n. . (IEGATE n')>; 

I0STR<1> - ID:»HE FORNALSiOVB VARS"l..llA:A>,«i..(l 'A'>:1,:A>, 

<i..l:J:l,>; 
IDSTR<i> - IBLIST<ALTSE0.(1 ,)>; 



ABITH EIF<«..« 

ARITH EXPO.. 4 

ID<1..1'> 

IB"!..! 1 ', SUBSCRIPT LIST«l..l'> 



.'* SUBSCRIPT LIST"...(C0BV TO IBT ■' 



AB4JB BAT*.... ' .' suoownir. u»oi -•.. i w« »*«_*.. . > i~» _ 

ARITH EXP<....'>, SUBSCBIPT HST<l..t , > - SUBSCRIPT LIST<1,«. JJ" ICOIV T0_IIT .'J>; 
ID<1..1'> - REAL/IIT/BOOL VAR«i..l T >i 

- REAL/IBT/BOOL VAR<1 |l ). . I GET EL (i',l'))>l 



IB<1..1*> 

ID<1..1 , > 

ID<1..1*> 

REAL/IIT/BOOL tAH<»..»' 

ARITH EXP<». .»'> 

BOOL BZP«b..b , > 

Dll EE7<d..<' 



- ACT PAR<l..>w.i' 

* ACT PAR"!..*".!' 
-» ACT PAR"i..A».l» 

* ACT PAR-V..AW.Y' 
» ACT PAR<». .»«.«' 

- ACT PAR<b..Ai.b' 

* ACT PAR<d. .!■.*■ 



ACT PAR«p..p'>, PAR DELIM<d> -ACT PAR PAHT«ALTSE8(P d) . .ALTSEQtp' 
ID<1..1'> -REAL/IBT/BOOL/IOIVAL PCB BES<1..<1' 

ID«1..1'>, ACT PAR PA«T"p..p'>-REAL/IIT/BO0L/I0IVAL FCI DES«i(p).. 
REAL FCI 0ES<f..f> | IBT PC» BIS<f..f"> | BOOL FCI BES<f..f> 
| I0IVAL FCI DES<f..f , > - FCI »ES"f..f">i 



■AM>1 
ll'lp'.ll' 



UISIOI IUM«p..p'> | REAL VAR<p..p 
I IBT FCB BES"p..p'> - PRIM-p 
ARITH IIP<«..«'> 
PRlH"p..p'>, RULT 0P"»* 
TEBN<t..t'>. ABB OP<«> 
TERN SEQ<I..i'> 
SIMPLE AHITB EIP 
BOOL IIP«b..b'>, 



| IIT VAR"p..p'> | REAL FCH BES<p..p* 



- PRIH<(. )...'>; 

* TE«M<ALTSE8.(p •)..COMB(p' »)>! 

* TERM SEQ"ALTSEQ(t »)..COMB(t' »)> 

* SIMPLE ARITH EXP". ..*•>.*♦■. . 
• ..••> ■> ARITH EXP«..i'>; 
SIMPLE ARITB EXP" , ARITH EIP<....'> 

- AIIT« EIKIF 6 THEI t ELSE «'..b'«»< •• ELSE' 



<->.. (IEGATE •')» 



BOOL PRIM" TRUE. . 'TRUE' > , 'FALSE. . ■ FALSE • > ; 

SIMPLE ARITB EIP««. .••>,"b. .b' > , REL 0P<r> - RELATIOI".rb. . (rU'-.b' ) )• ; 

RELATIOH<p..p'> | BOOL VAB<p..p"» | BOOL fZH BES<p..p'> •• BOOL PRIM«p..p'- 

BOOL EXP<b..b'> * BOOL PRIM< (b ). .b' > ; 

BOOL PRIM<p..p'> - BOOL SEC"p..p*>,< p..( p')>; 

BOOL SEC"I...'> - BOOL FACALTSEO.il A)..COMB(l- A)>; 

BOOL PAC"f..f> - BOOL TERM<ALTSEQ( f VK.CONBIf' V)>i 

BOOL TERM<t..t'> - BOOL IMP<ALTSEQ(t 3)..C0MB(t* 31>; 

BOOL IHP"1..1'> - SIMPLE BOOL- AITSE«( 1 l)..COXB(l' 5)>; 

SIMPLE BOOLo..i'> - BOOL EXP<l..s'>; 

BOOL EXP"b..b'>,"C..e'>, SIMPLE BOOL"a..s'> - BOOL EXP<IF b THEH a ELSE c. 



b' ^ a* ELSE **<:' 



LABELiVAL<t!Y> - SIMPLE BES EXP"t.. «". .»-i 

ID<1..1'>. ARITH EXP"...a<> - SIMPLE BES EXP" I !•)..( (GET EL(COIV TO_IIT .'.l')) •>•)■ 
BES EXP"d..d'> - SIMPLE BES EXP" (d) . .*■> ; 

SIMPLE BES EXP<a..a'> - BES EXP<a..l'>; 

BOOL EXP«b..b'>, SIMPLE BES EXP<l..a , >, DES EXP"d..d'» - DES EXP 
<IF b THEB a ELSE d..b* .pa' ELSE *d'>; 



ARITH EXP"«..a*> I BOOL EXP"e..e'> | BES EXP"e..e' 
BUMMT STH«A..'A'>; 

STR<>> - COKHEIT STM COHHEITo . . ' A' > j 

BBS EXP"d..d*> + GOTO STM<00 TO d..(COTO. d')>; 
FCI BES"f..f> * PROC STM<f..f'»; 



EXP'e. 



IBSTR«1> 

REAL/IIT/BOOL IU'I.,1 1 



- 5/I/B LEFT PART<1. 

- R/I/B LEFT PART"!. 



(1# ASSIGH. »)>; 
IBSTR"I> - R/I/B LEFT PART"1..L£T »-l' II 

( . ASSIGI. •)>; 
REAL/IIT/BOOL VAR" 1 [I ] .. (OET ■L(1 > .1')> - R/I/B LEFT PART" i li ).. LET »-!• II 

ASSIGI. (RESET_EL(< 
R/I/B LEFT PART«l..l'>, ARITH/ARITB/BOOL EXP<« 



R/I/B LIFT PART"!..*' 
R/I/B ASOT STM«...i"> 



, R/l/B ASGT STM"S..S' 
* ASGT STM<«..«*>t 



- R/I/B AS 
LET ..(COHV TO_REAL/COIV TO_HT'ILEI 

- R/I/B ASOT STMl:-«..l';T' " 



AltTTK EXP<« ft*> ~* P° B LIST EL* a. . A ■ . • ' ' ; 

i«ITH EXP".". 1 ' "b b->.<e..c'> - FOR LIST EL". STEP b UBTL c. . »■ . (STEPl >..a' ,i. .b 

ARITH EXP"«"«'>" BOOL IIP<b..b'> - FOR LIST EL". WHILE b. . ». . (VBILEl »» ..' ,».b' ) )> i 

FOR LIST EL"...«'> * "» LIST«ALTSE9( .. ) . .ALTSIQf. 1 .)'! 

REAL/IBT VAR'T .»". FOR LIST«t..f>, STM"i..l'> - FOR STM<FO» »;-t BO ...(FORIt'. BELAT_CAT |l^.t ) 



lf.c')l> 



20.1 

20. > 

21.1 

21.2 

31.} 

22.1 
22.2 
22. J 



2».l 
2k. t 

25.1 
25.2 

25. »l_ 

25. n 

25.1 
25.* 
25. T 

25. a 

25. » 

25.10 

2<.l 
2S.2 

2T.12 



uicoid im 



coid m 



in 

•IH SI« 
COMPOUND STM

TIH DtC 



UUT DIC 



m dic 

PPOC DtC 



21.1 
21.2 
2».» 



30.1 
39.2 
33.3 



12.2 
'2.3 

12. » 
1-2.5 



IMCOn SIM •...• ; LUn,TAL 

•901 »»'».•*•'. oaeoiri tiii<, 

, * ^n r uwi . • 

•OOL 

k 

con 

tneon sn<« | con •»<•..■ 

STN<|..|*> 
:»•»..■•» sin SIQ< 4 .,q 

am 5«q<i>, jiic,> 



- eon »TKrLAiru<ir t i>t> 
itAiiu<ip t mi , 



* sriH t 

5TPJ S14<i.,i 

sin in<in. 
ARin IZP«....«> «!. !.•> l *



• aom»r*i •<•*:*» 
- »m«i«». .•!»>; 



• lb- 



- »fU»I-.. t .. V .|, b., 

- inifiliur IM<l{ll..t>(J)MtXI<Tg}fJ(M) I l. 



SIC 
DIC SIS. 



• LOCK 
PROGRAM



list.copp 

•UIL LIST 

piSTiCOIP u_- 

3IUI LIST 
■mi LIST
II



If 



IP' 



sic*..,.,., aw «,....,« ...,., „ s? iw.rii^K"!::..,.,,^ 
•Loci<p..,<> | cewons •!»<»...•>



ppoopak m<p..p- 
Pkooww mui' 



- A100L PI00MIH...LIT » > . l!»."ll ,* " 



LISTlCOSP »UIL LI»T</. i,\ >i 
LISIiCOPP ItlU LI8I-1,.- . ID<TP<1> 
LIITtCOIP nSIAPI III!.],!. 
UaTiCOPP UISIUI UIT<lta>, I9»IP<t» 

tm^opp mint m-..,;., . 
lkticopp inms LHT<ti«>. iDsn<i> 
- uiTieoaa-rau uim.it-a',!'!



Appendix 4.3 DEFINITION OF PRIMITIVE FUNCTIONS

».t «.fi r itl.«. f.r itrlu WfllltHi I t«M«» I *.«.»•«• I 



MOM 

CM* 



m 



»1§IT«0>,«1> <»»l . „ 

Utm<t>,<i>, ... ,«t>,« ,, «-», «*»•■» »*»» 

HUg-" !"-*-, ... •<t > i 

sioift* I tmt»«i* I ■»««»* • cto«f>i 

•Tt<»M 



CAT « • £ •. ■»• "[*♦•" t 

*<...>- [ :;: :: 
itwi

fAL»I 



at(«,i) • 
■o* ■ • 

IB • ■ 
gnuii ■ •
in i • 



cms/not *• 

nm/Htn »• 

run/tan — 

fAMI/IAMI — 



MUI 

run 

PALI! 
•/•



<*• run 
•»> Arlth»«tle twtiiilM prl»ltlT»» (»«« arltkaotle prlaltlvoa for definition! of

♦ "a i) ■ . . 



TRAIS_IBT ■ • 

TRAI8_PRAC a - 

COBV_TO_RBAL a • 
IITIIl Z - 

COST TO IIT Z - 






/«/ 
sDl 



/•/ 



/•/ 



/■«0/ » /.a/ 

/■d/tDr. - /i/dtBOr 

1 1 Til. ♦• tBlr 

/«/ * /•/» 

«Dt. *• aBt "I 
■ . »• »D1 J a 

LIT A, B • ION I, DO X 
IB /U.B) 

utiiiK* (x, 'iB2t)) 



(d) Arithmetic primitives



soce 



/•0/r. 
/•1/r. 



• Ir 

• 2r 



/■S/r. *• >9r 

/»9/r. ♦ /i/Or 

//r. ♦ . i r 

/0/. ». o 

/l/9r. — 9r 

/«0/r. » /»/9r 

/■1/r. •• *0r 

/•2/r. ■»• iir 



/•9/r. 



*8r 



/a/ 



/•/ 



REC ∸(X,Y) = EQ(Y, '0') ⇒ X
             ELSE ⇒ ∸(PRED X, PRED Y)

REC SUM(X,Y) = EQ(Y, '0') ⇒ X
               ELSE ⇒ SUM(SUCC X, PRED Y)

LESS(X,Y) = NEQ(∸(Y,X), '0')

REC PROD(X,Y) = EQ(Y, '0') ⇒ '0'
                ELSE ⇒ SUM(X, PROD(X, PRED Y))

DIFF(X,Y) = LESS(X,Y) ⇒ NEGATE(∸(Y,X))
            EQ(X,Y) ⇒ '0'
            ELSE ⇒ ∸(X,Y)

REC QUOT(X,Y) = LESS(X,Y) ⇒ '0'
                ELSE ⇒ SUM('1', QUOT(∸(X,Y), Y))

FBI SUN(Z.I) • AID(I8 IBT Z, IS IIT T) =? SON(X.I) 

ILSI ~ ~ =* LIT B1,B1,I2,D2 • BOM X t MB X, BOM T, M« » 

IB LIT B • DIFP(PI0B(B1,D2), PR0D(B2,D1 )) 
IB LIT D • PRODtBl.BZ) 

IB KAIIJtIALd.D) 

FBI DIPP(Z.T) - AID(IS IBT Z, IB IBT T) ^ DIFP(Z.T) 

BL8I " ~ ^ LIT 11,01, S2.D2 • BOH Z, DM Z, BON T, DBB T 

IB LIT B ■ DIPF(PR0B(Bl,B2),F*0B(B2,Bl)) 
IB LIT D • >R0D(B1,D2) 

IB KAK_MAl(I,D> 

FRI_PROD(Z,T) - ABD(IS IBT X, 18 IBT T) ^ PROB(X.T) 

(LSI ~ J? LIT I1,B1,B2,B2 • BON Z, DIB Z, BUN T, DBB T 

IB LIT I • FR0D(B1,B2) 

IB LIT B - FR0B(B1,B2) 

II NAKI_BBAL(B,D) 

FRI_QOOT(Z,T) - AID (18 IBT Z, IB IBT I) ** «OOT(Z,I) 

■LSI ~ ^ LIT II, 12, 12. 02 • BON X. OBI X, ION T, DII T 

IB LIT B • PR0D(B1,D2) 

II LIT D • PM0(B2,D1) 

IB MAIt IIAL(I.D) 
SIGB(X.Y) • AKD(IS POS X, IS POS T) => 'A'

AID(IS~FOS X, IS~»EG T) =* ■-• 

ANO(IS~IIEO X, IS~P08 T) =p '-' 

ELSE ~ ~ ^=* 'A' 

»(X,T) • AID(IS POS X, IS POS T> ■=# PRI SUM(X,Y) 

AID(IS~POS X, IS~BIO T) =S PRI~DIFF(X, AW T) 

AXD(IS _ IEO X. IS~POS T) -» PHI MPF(Y, ABS X) 
ELSE "" -^ REGATE(FRI_SIW(ABS X, ABS T)) 

<(X,Y) « LET S • SIGI(X.Y) 

III CAT(S, PRI_PROD(ABS X, ABS X}) 

/(X,T) ■ LET S « SIGX(X,Y) 

IK CAT(S, PRI_QU0T(A8S X, ABS Y)) 

-(X.Y) • ♦ (X, IEGATE Y) 

«<X,Y) » LET S • SIGR(X.Y) 

III CAT(S, ERTIER(ABS (/(X.Y))) 
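
The recursive integer primitives of this section (SUM, PROD, QUOT, LESS and the truncated difference) build all arithmetic from SUCC and PRED alone. The following Python sketch mirrors that structure under stated assumptions: Python non-negative integers stand in for the digit strings of the appendix, the lowercase names are illustrative and not part of the dissertation, and the predecessor of zero is assumed to be zero.

```python
def succ(x: int) -> int:
    """Successor of a non-negative integer."""
    return x + 1


def pred(x: int) -> int:
    """Predecessor; assumed here to stay at zero when x is zero."""
    return x - 1 if x > 0 else 0


def monus(x: int, y: int) -> int:
    """Truncated difference: x - y, or 0 when y exceeds x."""
    return x if y == 0 else monus(pred(x), pred(y))


def nat_sum(x: int, y: int) -> int:
    """Sum by moving one unit at a time from y to x."""
    return x if y == 0 else nat_sum(succ(x), pred(y))


def less(x: int, y: int) -> bool:
    """x < y exactly when the truncated difference y - x is non-zero."""
    return monus(y, x) != 0


def prod(x: int, y: int) -> int:
    """Product by repeated addition."""
    return 0 if y == 0 else nat_sum(x, prod(x, pred(y)))


def quot(x: int, y: int) -> int:
    """Integer quotient by repeated subtraction (y must be non-zero)."""
    if y == 0:
        raise ZeroDivisionError("quotient by zero is undefined in this sketch")
    return 0 if less(x, y) else nat_sum(1, quot(monus(x, y), y))
```

The recursion depth grows with the size of the operands, so the sketch is only suitable for small values; it is meant to show the shape of the definitions, not to be an efficient implementation.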



(e) Boolean primitives

¬X = NOT X

∧(X,Y) = AND(X,Y)

∨(X,Y) = NOT(AND(NOT X, NOT Y))

⊃(X,Y) = NOT(AND(X, NOT Y))

≡(X,Y) = EQ(X,Y)

PPI LESS(X.Y) » LET B1,D1,N2,D2 • IUM X, DEI X, ION Y, DEB Y 
IB LESS(PROD(lll,D2), P»OD(B2,Dl ) ) 



«(X,Y) - AID(IS POS X, IS POS Y) ^ PRI LESS(X.Y) 

AID(IS~POS X, IS~NEC Y) =# FALSE 

AID(IS~IEG X, IS~POS Y) =* TRUE 

ELSE " =» PRI LESS (ABS I, ABS X) 



-(X.Y) - 


EQ(I.Y) 


rf(X.Y) - 


RE«(X,Y) 


<(X,Y) . 


V(<(X,Y), - (X,Y) > 


»(X,Y) - 


IOT(<(X,Y)) 


><X,Y> - 


NOT(i(X,Y)) 
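
The Boolean connectives of this section are all reduced to the three primitives NOT, AND and EQ. As a minimal sketch of that reduction (the Python function names below are illustrative assumptions, not identifiers from the dissertation), the derived connectives can be written and checked against their truth tables:

```python
def NOT(x: bool) -> bool:
    return not x


def AND(x: bool, y: bool) -> bool:
    return x and y


def EQ(x: bool, y: bool) -> bool:
    return x == y


def OR(x: bool, y: bool) -> bool:
    # Disjunction via De Morgan: NOT(AND(NOT X, NOT Y))
    return NOT(AND(NOT(x), NOT(y)))


def IMPLIES(x: bool, y: bool) -> bool:
    # Implication: false only when X holds and Y does not
    return NOT(AND(x, NOT(y)))


def EQUIV(x: bool, y: bool) -> bool:
    # Equivalence is plain equality of truth values
    return EQ(x, y)
```

Enumerating the four combinations of truth values confirms that each derived function agrees with the usual truth table for its connective.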


(f) For statement primitives



REC STEP(A,B,C) • LET A'.BJC; " (A 'A'),(B 'A'),(C 'A') 
II AIDtIS POS B; LESS(C'A')) =* 'A' 

AHD(IS - IEO BJ LESS(AJO) *> 'A' . 

ELSE ~ a»£A' + »..(STIP(>«. (♦(AJB:)),B,C)fl 

REC WHILE(A.B) • LET A;B' ■ (A 'A'),(B 'A') 

II IOT B" =& 'A ' _ 

ELSE =9^' t )l<i.(UCILE (A,B))J 

REC DELAY CAT L ■ LET H.T • HD L, TL L 
II LET H' • (K 'A' ) 

IB EQ(T, 'A') =9 H'- 

EQ(H', 'A') =» (DEIAY_CAT T) 
ELSE '*P'* T 1 

REC FOR(V,L,S) • LET H,T - HD L, TL L 

IB EO.U, "A") =9 'A' ,, - 

ELSE =3 (IS IBTV) •♦ (VASSIOB. (COIT TO IBT I)) ELSE ^ 

'Vasijgi. (eoiv to rial i)7i 

(S -A 1 ); 

FOR (», (DELAY CAT T), S) 
(g) Array and list primitives

GET_EL(I,L) - [ r(l,«)t. *• • ] L 

RESET_EL(I,L,X) - [ r(I, B )t. ->• r(I,X)t ] L 

REC IIDEX_LIST(I,L) - LET H,T - HD L, TL L 
II BULL T => (I, a) 



ELSE =St(I,H) + IIDEX_LIST(-f(l,l), T )]J 

BEC LAST L - LET H,T « HD L, TL L 

III HULL T =i>H 

ELSE =£ LAST T 

REC TRUIC L - LET H,T ■ HD L, TL L 

II IULL(TL T) =£HD T 

ELSE =*&« TRUIC TJ 

REC ADD1(SUBSLIST,LB.UB) - LET S ,S ,S ,T ,T ,T - LAST SUBSLIST .LAST LB, LAST UB .TRUIC LB , TRUIC UB 
II ■EQ(S 1 ,S 3 ) =*?!,. (♦(Sj^, * 1 ' ) 
ELSE =i>^DDl(T 1 ,T 2 ,T 3 ) < S^] 

REC MAKE_LIST(I,LB,UB) - EQ ( I , UB ) ^ (I, 'A') 

ELSE =^Dl. '»' ) ♦ MAKE_LIST( ( ADD1 ( I ,LB,UB) ) , LB.UbJJ 

REC RESET_LIST _ EQ(J,UB) =s> (J, GET_EL(J, ARRAY)) 

(ARRAY, J, LB, UB) " ELSE =>C J « °ET_EL(J, ARRAY ) ) . RESET_LIST( ( ADD1 ( J , LB ,UB) ) ,LB,UB )] 






Appendix 5. THEORETICAL BACKGROUND
FOR CANONICAL SYSTEMS



The intent of this appendix is (a) to describe and 
relate the formalisms of Post's formal systems and 
Smullyan's "elementary formal" systems, (b) to show that 
the formalism of "canonical" systems presented in this 
dissertation is equivalent (except for changes in notation) 
to Smullyan's elementary formal system, and (c) to show that 
the terminology and interpretation of canonical systems 
given here relate to the terminology and interpretation of 
the formal systems of Post and Smullyan. 

A formal system will be described by giving 

(a) A set A of primitive symbols: For example, this set may 
be the symbols {0 1 ... 9} or the set of characters in 
a computer language. 

(b) A set C of auxiliary symbols:* For example, this set 
may include the symbols {SQ + *}. 

(c) A set S of initial statements composed from the primitive 
and auxiliary symbols: The set S will be composed of 
strings from A ∪ C.** 

(d) A set E of well-formed expressions: The set of 
well-formed expressions will generally incorporate symbols 
from A ∪ C and other symbols. 

(e) A series of rules for using the well-formed expressions: 
The rules will be used to derive new statements 
containing the primitive symbols from the set S of initial 
statements. 



*All sets of symbols in the systems of Post and Smullyan are 
assumed to be disjoint from each other. 

**The symbol "∪" denotes the binary operation of set union. 






(f) An interpretation of the formal system: Strictly speak- 
ing, an interpretation is not part of a formal system. 
An interpretation is placed on a formal system by a user, 
who wishes to draw conclusions about the objects that 
the symbols of the system represent. 



POST'S SYSTEMS 

(a) Primitive Symbols 

Let A be a finite set of symbols {A_1 A_2 ... A_i}. 

(b) Auxiliary Symbols 

Let C be a finite set of symbols {C_1 C_2 ... C_j}. 

Let L be the set A ∪ C, the union of the sets A and C. Post 
calls the set L the set of "primitive letters" and does not 
distinguish between the sets A and C. The sets A and C are 
distinguished here to clarify the distinction between a Post 
system and a Smullyan elementary formal system. 

(c) Initial Statements 

The initial statements S are a set {S_1 S_2 ... S_k}, where 
each S_i, 1 ≤ i ≤ k, is a string of letters from L. 

(d) Well-formed Expressions 

Let V be a finite set of symbols {V_1 V_2 ... V_r} called 
variables. 

A premise is a string of symbols from L ∪ V. 

A conclusion is a string of symbols from L ∪ V. 

A well-formed expression is a string of the form 
"Q_1, Q_2, ..., Q_m produce C" where the Q_i, 1 ≤ i ≤ m, 
are premises and C is a conclusion such that each 
variable in C also occurs in at least one Q_i. A 
well-formed expression is called a production. 

A set E is a system in canonical form if E is a finite set 
{P_1 P_2 ... P_n}, where each P_i, 1 ≤ i ≤ n, is a production. 

(e) Rules for Using Well-formed Expressions 

Rule 1: A string X is called an instance* of a production P_i 
if X can be obtained from P_i by substituting for 
each variable in P_i some string (possibly null) of 
letters from L. The string substituted for each 
occurrence of the same variable must be the same. 



*The word "instance" is not used by Post. 
Rule 2: If each premise in an instance of a production has 
been derived, then the conclusion of the production 
can be derived. 

The statements derivable from a Post system are 

(a) The initial statements 

(b) The statements that can be derived from the 
productions by first applying Rule 1 to obtain 
an instance of the production and then applying 
Rule 2 to the production instance. 

(f) Interpretation 

A production can be viewed as a rewriting rule for obtaining 
new statements from previously derived statements. 
The interpretation of the derived statements is subject 
to the interpretation of the initial letters. 

Example 1 : A Post System Defining the Set of Squares of 
Positive Integers 

(a) Primitive Symbols A = {1} 

(b) Auxiliary Symbols C = {SQ} 
L = {1 SQ} 

(c) Initial Statements S = {1SQ1} 

(d) Well-formed Expressions V = {u v} 

E = {uSQv → u1SQuuv1} 

(e) Derived Statements {1SQ1 11SQ1111 111SQ111111111 ...} 



(f) Interpretation 

The string of ones occurring to the left of "SQ" represents 
the positive integer denoted by the number of ones. 

The string of ones occurring to the right of "SQ" represents 
the positive integer that is the numerical square of the 
integer to the left of "SQ". 
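
The derivation in Example 1 can be traced mechanically. The Python sketch below is only an illustration under stated assumptions: the function name derive_squares is hypothetical, and the sketch relies on the fact that every derived statement of this particular system contains exactly one "SQ", so the values of u and v in the premise uSQv can be recovered by splitting the statement at "SQ".

```python
def derive_squares(steps):
    """Apply the production uSQv -> u1SQuuv1 repeatedly, starting from 1SQ1."""
    derived = ["1SQ1"]                      # the single initial statement
    for _ in range(steps):
        u, v = derived[-1].split("SQ")      # match the premise uSQv
        derived.append(u + "1" + "SQ" + u + u + v + "1")
    return derived


if __name__ == "__main__":
    for statement in derive_squares(3):
        left, right = statement.split("SQ")
        # Under the interpretation of Example 1, the count of ones on the
        # right is the square of the count of ones on the left.
        print(len(left), len(right))
```

The conclusion appends 2n + 1 ones to the n*n ones already on the right, which is why the right-hand side always holds (n + 1)*(n + 1) ones.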



Example 2 : Another Post System Defining the Set of Squares 
of the Positive Integers. 

Note: The intent of this example is to illustrate that the 
"canonical systems" given in this dissertation fit 
the definition of a system in canonical form given by 
Post. 
(a) Primitive Symbols A = {1} 

(b) Auxiliary Symbols C = {N:SQ < > :} 
L = A ∪ C = {1 N:SQ < > :} 

(c) Initial Statements S = {N:SQ<1:1>} 

(d) Well-formed Expressions V = {u v} 

E = {N:SQ<u:v> → N:SQ<u1:uuv1>} 

(e) Derived Statements 

{N:SQ<1:1> N:SQ<11:1111> N:SQ<111:111111111> ...} 

(f) Interpretation 

The string "N:SQ" is the name of a set. 

The strings "<x:y>", where x and y are strings of ones, 
are members of the set "N:SQ". 

The string of ones before the ":" represents a positive 
integer; the string of ones to the right of the ":" 
represents the square of the positive integer to the 
left of the ":". 
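
The interpretation above reads each derived statement as a set name followed by a member of that set. A minimal Python sketch of that reading follows; the regular expression and the name parse_statement are assumptions of the sketch, and it accepts only set names written in capital letters and colons, as in "N:SQ".

```python
import re

# Hypothetical pattern: a set name, then terms enclosed in angle brackets.
STATEMENT = re.compile(r"^(?P<name>[A-Z:]+)<(?P<terms>[^<>]*)>$")


def parse_statement(statement):
    """Split a statement such as N:SQ<11:1111> into its set name and term tuple."""
    match = STATEMENT.match(statement)
    if match is None:
        raise ValueError(f"not a statement of the expected form: {statement!r}")
    return match.group("name"), tuple(match.group("terms").split(":"))
```

For the statement N:SQ<11:1111> this gives the set name "N:SQ" and the member ("11", "1111"), that is, the integer two paired with its square.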



SMULLYAN'S "ELEMENTARY FORMAL" SYSTEMS 2 

Smullyan's elementary formal systems are a descendant of Post's 
formal systems. 

(a) Primitive Symbols 

Let A be a finite set of symbols {A_1 A_2 ... A_i} called 
the object alphabet. 

(b) Auxiliary Symbols 

Let P be a set of symbols {P_1 P_2 ...} called the predicate 
alphabet. With each predicate alphabet symbol we 
associate a unique positive integer called its degree. 
Let Z be the set {, →}. The symbol "→" is called the 
"implication" sign and the symbol "," is called the 
"punctuation" sign. 
The set C of auxiliary symbols is the set P ∪ Z. 

(c) Initial Statements - None 

Smullyan includes the initial statements as members of 
the set of well-formed expressions. 



(d) Well-formed Expressions 

Let V be a set of symbols {V_1 V_2 ...} called the set of 
variables. 

A term is a string from V ∪ A. 
A well-formed atomic formula is a string of the form 
"Pt_1,t_2,...,t_k" where the t_i, 1 ≤ i ≤ k, are terms and 
P is a predicate of degree k. 

A well-formed expression is either an atomic formula or 
an expression of the form X_1 → X_2 → ... → X_m (assuming 
association to the right; e.g., "X_1 → X_2 → X_3" is to 
be read "X_1 implies (X_2 implies X_3)"), where the X_i, 
1 ≤ i ≤ m, are atomic formulas.* A well-formed expression 
is called a well-formed formula. 

A set E is an elementary formal system if E is a finite 
set {F_1 F_2 ... F_n} where the F_i, 1 ≤ i ≤ n, are 
well-formed formulas, called axioms. 

(e) Rules for Using Well-formed Expressions 

Rule 1: (Substitution) A formula F' can be derived from a 
formula F by substitution if F' can be obtained from 
F by substituting a string in A for each occurrence 
of some variable in F.** 

Rule 2: (Modus Ponens) A formula F' can be derived from a 
formula F by modus ponens if F is of the form X → F' 
and X is some previously derived atomic formula. 
More generally, a formula X_n can be derived from a 
formula of the form X_1 → X_2 → ... → X_{n-1} → X_n if 
each X_i, 1 ≤ i ≤ n, is an atomic formula and 
X_1, X_2, ..., X_{n-1} have each been previously derived. 
In this case, we refer to X_1, X_2, ..., and X_{n-1} as 
premises, X_n as a conclusion, and say that the conclusion 
X_n is derivable from the conjunction of the premises 
X_1, X_2, ..., and X_{n-1}.*** 

The "provable strings" of an elementary formal system E are 
(i) the axioms of E 

(ii) the strings that can be derived from the axioms by 
a finite number of applications of rules 1 and 2. 
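
The two rules can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than Smullyan's own formulation: a formula is encoded as a tuple of atomic formulas read as a right-associated implication, an atomic formula is a pair of a predicate and a tuple of terms, and variables are single lower-case letters that do not occur in the object alphabet, so that plain string replacement is safe. The names substitute, modus_ponens, AXIOM_1 and AXIOM_2 are hypothetical; the axioms are those of the squares system given in Example 3.

```python
# Axioms "R1,1" and "Ru,v -> Ru1,uuv1" in the hypothetical tuple encoding.
AXIOM_1 = (("R", ("1", "1")),)
AXIOM_2 = (("R", ("u", "v")), ("R", ("u1", "uuv1")))


def substitute(formula, variable, string):
    """Rule 1: replace every occurrence of one variable by an object string."""
    return tuple(
        (pred, tuple(term.replace(variable, string) for term in terms))
        for pred, terms in formula
    )


def modus_ponens(formula, derived_atom):
    """Rule 2: from X -> F' and a previously derived atomic formula X, derive F'."""
    if len(formula) > 1 and formula[0] == derived_atom:
        return formula[1:]
    return None


# Substituting for u alone leaves v in place; the result is still a formula.
step_1 = substitute(AXIOM_2, "u", "1")      # R1,v -> R11,11v1
step_2 = substitute(step_1, "v", "1")       # R1,1 -> R11,1111
step_3 = modus_ponens(step_2, AXIOM_1[0])   # R11,1111
```

Note that step_1 still contains the variable v, which reflects the point that an elementary formal system permits derived strings containing variables.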



*Note that no restriction is placed on the use of a variable 
occurring in X_m but not in X_i, 1 ≤ i ≤ m-1. 

**In an elementary formal system, it is not necessary to 
substitute object strings for each variable in a formula to 
derive strings from the well-formed formulas. Thus we can 
derive strings containing variables in an elementary formal 
system. In a Post system, we must substitute object strings 
for each variable in a production before we can derive strings. 

***If each variable is replaced by an object string, this 
generalization of modus ponens is identical to rule 2 for 
deriving strings given by Post. 
An instance of a well-formed formula F is a string obtained 
from F by applying rule 1 (substitution) to all variables in 
F. A formula so obtained is called a sentence. 

The "provable sentences" of an elementary formal system E are 
the provable strings containing no variables. 

(f) Interpretation 

Let P be a predicate of degree k in an elementary formal 
system E, and let Y be a set of k-tuples of strings from 
A. We say that the predicate P represents the set Y if 
the following condition holds: PX_1,X_2,...,X_k is a 
provable sentence in E if and only if the k-tuple 
(X_1, X_2, ..., X_k) is contained in Y. 

Thus an elementary formal system can be viewed as a set of 
axioms used to enumerate the members of sets whose names are 
denoted by the predicates. 

Example 3 : An Elementary Formal System Defining the Set of 
Squares of the Positive Integers 

(a) Primitive Symbols A = {1} 

(b) Auxiliary Symbols P = {R} Z = {, →} 

(d) Well-formed Expressions V = {u v} 

E = {R1,1 Ru,v → Ru1,uuv1} 

(e) Derived Statements 

{R1,1 R11,1111 R111,111111111 ...} 
The derived statements given above are (in the Smullyan 
sense) the atomic sentences derived from E. 

(f) Interpretation 

If R is the name of a set, the ordered pairs 
{(1,1) (11,1111) (111,111111111) ...} are the members of 
R. We interpret the set R as containing all ordered pairs 
such that the string to the left of the "," represents a 
positive integer and the string to the right of the "," 
represents the positive integer that is the square of the 
integer represented by the string of ones to the left of 
the ",". 
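
The statement that R represents the set of squares is an "if and only if" claim, and it can be spot-checked on a finite fragment. The Python sketch below does so for Example 3; the names provable_pairs and represents_squares and the bound max_len are assumptions of the sketch, and a check of this kind is evidence for small cases only, not a proof.

```python
def provable_pairs(max_len):
    """Forward-chain from the axiom R1,1 using Ru,v -> Ru1,uuv1.

    Only pairs whose first component has at most max_len ones are kept.
    """
    pairs = set()
    u, v = "1", "1"
    while len(u) <= max_len:
        pairs.add((u, v))
        # Both right-hand expressions use the old values of u and v.
        u, v = u + "1", u + u + v + "1"
    return pairs


def represents_squares(max_len):
    """Compare provability with membership in Y on a finite universe of pairs."""
    provable = provable_pairs(max_len)
    for m in range(1, max_len + 1):
        for n in range(1, max_len * max_len + 1):
            in_y = (n == m * m)                          # membership in Y
            is_provable = ("1" * m, "1" * n) in provable
            if in_y != is_provable:
                return False
    return True
```

The inner loop tests both directions of the condition: every pair in Y must be provable, and no pair outside Y may be.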
CANONICAL SYSTEMS (as presented in this dissertation) 

The formalism called "canonical systems", as presented in 
this dissertation, is equivalent (except for changes in nota- 
tion) to Smullyan's elementary formal systems. 

(a) Primitive Symbols In this dissertation the primitive 
or "object" alphabet is the set of characters used in 
some computer language. 

(b) Auxiliary Symbols Each predicate in the predicate 
alphabet P here is a sequence of strings of English 
letters or digits, each separated by the tuple sign ":". 
Each string of English letters or digits is called a 
predicate part, and the number of predicate parts in a 
predicate is usually identical to the number of terms in 
the term tuple following the predicate. The separation of 
predicates into parts is made (a) to give some mnemonic 
describing the role of each term in the term tuple 
following the predicate, and (b) to provide a convenient 
notation for abbreviating a canonical system. 

The set Z is given as {: →} rather than {, →} since 
the comma "," is a character occurring frequently in 
computer languages. 

(d) Well-formed Expressions A well-formed formula 
"X_1 → X_2 → ... → X_{n-1} → X_n" is written here as 
"X_1, X_2, ..., X_{n-1} → X_n" to connote the meaning that 
X_n is derivable from a canonical system if and only if 
each of the instances of the premises X_1, X_2, ..., X_{n-1} 
is derivable. This alternate formulation is in the 
spirit of Post. 

The delimiter ";" is introduced here to separate the 
well-formed formulas of a canonical system. The 
well-formed formulas in a Smullyan system are separated by 
the use of appropriate spacing of formulas on a page of 
text. 

Furthermore, the string of terms following a predicate 
is enclosed by the angle brackets "<" and ">" so that the 
characters ",", ";" and "→" can be used in the terms as 
object symbols without the use of quotation marks. 
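
The notational changes just described amount to a mechanical rewriting of a Smullyan formula. The following Python sketch shows one way this could be done, assuming the tuple encoding used for illustration only (a formula is a tuple of pairs of a predicate and its terms) and writing the implication sign as the ASCII string "->". The function names are hypothetical, and the sketch does not handle the abbreviations mentioned in (b).

```python
def atom_to_canonical(predicate, terms):
    """Enclose the terms in angle brackets, separated by the tuple sign ':'."""
    return f"{predicate}<{':'.join(terms)}>"


def formula_to_canonical(formula):
    """Write X_1 -> ... -> X_n as 'X_1, ..., X_{n-1} -> X_n'."""
    *premises, conclusion = [atom_to_canonical(p, t) for p, t in formula]
    if not premises:
        return conclusion
    return ", ".join(premises) + " -> " + conclusion


def system_to_canonical(formulas):
    """Separate the formulas of a system with the delimiter ';'."""
    return ";\n".join(formula_to_canonical(f) for f in formulas)


# The squares system, with the two-part predicate N:SQ of Example 2.
SQUARES = [
    (("N:SQ", ("1", "1")),),
    (("N:SQ", ("u", "v")), ("N:SQ", ("u1", "uuv1"))),
]
```

Applied to SQUARES, system_to_canonical produces the initial statement N:SQ<1:1> and the production N:SQ<u:v> -> N:SQ<u1:uuv1>, separated by ";".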

(e) Rules for Using Well-Formed Expressions The rules for 
using well-formed productions of a canonical system are 
identical to the rules used by Smullyan. 

(f) Interpretation The interpretation given to a canonical 
system here is a hybrid of the interpretations of the 
systems of Post and Smullyan: 

(i) The productions of a canonical system are viewed 
as rewriting rules (Post). 

(ii) The derived strings of a canonical system are 
viewed as statements about the membership of 
n-tuples of strings in sets whose names are given 
by the predicates (Smullyan). 